Plant Molecular Biology 30: 65-75, 1996. © 1996 Kluwer Academic Publishers. Printed in Belgium. 65 Higher-plant chloroplast and cytosolic 3-phosphoglycerate kinases: a case of endosymbiotic gene replacement Henner Brinkmann 1 and William Martin 2"* l Institutfiir Botanik, Technische Universitgit Braunschweig, Spielmannstrasse 7, D-38023 Braunschweig, Germany; 2Institut ffir Genetik, Technische Universitiit Braunschweig, Spielmannstrasse 7, D-38023 Braunschweig, Germany (* author for correspondence) Received 10 July 1995; accepted in revised form 25 September 1995 Key words: Archaebacteria, endosymbiosis, molecular evolution, isoenzymes, carbohydrate metabolism Abstract Previous studies indicated that plant nuclear genes for chloroplast and cytosolic isoenzymes of 3-phosphoglycerate kinase (PGK) arose through recombination between a preexisting gene of the eukaryotic host nucleus for the cytosolic enzyme and an endosymbiont-derived gene for the chloroplast enzyme. We readdressed the evolution of eukaryotic pgk genes through isolation and characterisation of a pgk gene from the extreme halophilic, photosynthetic archaebacterium Haloarcula vallismortis and analysis of PGK sequences from the three urkingdoms. A very high calculated net negative charge of 63 for PGK from H. vallismortis was found which is suggested to result from selection for enzyme solubility in this extremely halophilic cytosol. We refute the recombination hypothesis proposed for the origin of plant PGK isoenzymes. The data indicate that the ancestral gene from which contemporary homologues for the Calvin cycle/glycolyticisoenzymes in higher plants derive was acquired by the nucleus from (endosymbiotic) eubacteria. Gene duplication subsequent to separation of Chlamydomonas and land plant lineages gave rise to the contemporary genes for chloroplast and cytosolic PGK isoenzymes in higher plants, and resulted in replacement of the preexisting gene for PGK of the eukaryotic cytosol. Evidence suggesting a eubacterial origin of plant genes for PGK via endosymbiotic gene replacement indicates that plant nuclear genomes are more highly chimaeric, i.e. contain more genes of eubacterial origin, than is generally assumed. Abbreviations: PGK, 3-phosphoglycerate kinase; FBA, fructose-l,6-bisphosphate aldolase; GAPDH, glyceraldehyde-3-phosphate dehydrogenase; TPI, triosephosphate isomerase. Introduction The evolutionary origin of chloroplast/cytosol enzymes in plants traditionally has been viewed in light of a working hypothesis known as the gene transfer corollary to endosymbiotic theory [49], which predicts that the genes for cytosolic isoenzymes were endogeneous to the nuclear genome The nucleotide sequence data reported will appear in the EMBL, GenBank and DDBJ Nucleotide Sequence Databases under the accession number L47295. 66 at the time of chloroplast acquisition and that, in the process of chloroplast genome reduction, genes for chloroplast isoenzymes were transferred to the nucleus where they acquired a transit peptide for reimport into the organelle of their genetic origin. Calvin cycle and glycolysis possess a number of parallel enzymatic reactions in cytosol and stroma which are catalysed by distinct isoenzymes specific to each compartment, whereas homologues of the Calvin cycle and the oxidative pentose phosphate pathway appear only to occur in chloroplasts [42]. Molecular sequences for these genes can help to shed light on the evolutionary mechanisms which led to the biochemical partitioning now observed in contemporary photosynthetic eukaryotes. Recent molecular studies of chloroplast/cytosol isoenzymes of Calvin cycle and glycolytic pathways revealed, perhaps surprisingly, that the gene transfer corollary has not been substantiated for any isoenzyme pair studied in plants to date. It was gene duplication, not endosymbiotic gene transfer, which provided the origin of the chloroplast/cytosol isoenzymes for fructose-l,6-bisphosphate aldolase (FBA, duplication in early eukaryotic evolution) [36], triosephosphate isomerase (TPI, duplication in chlorophyte evolution) [21, 41 ], and glyceraldehyde-3phosphate dehydrogenase (GAPDH, duplication in eubacterial evolution) [32, 20]. In the initial report of molecular sequences for chloroplast and cytosolic 3-phosphoglycerate kinase from plants (PGK) [30] it was noted that the sequences are more similar to one another than one would expect under the gene transfer corollary. It was suggested that recombination had occurred between the respective isoenzyme genes in an attempt to explain the lack of expected similarity between plant and other eukaryotic cytosolic PGK. But, due to the lack at that time of sufficient reference sequences from prokaryotic and eukaryotic sources, an explicit hypothesis was not put forth for the origin of either gene in the context of either eukaryotic or eubacterial evolution. In a later study of PGK gene evolution, Vohra et al. [48] interpreted the phylogenetic results obtained as statistical support for the recombination model. Sequences from archaebacteria are neccessary for comparison to eubacterial and eukaryotic homologues in order to distinguish between different possibilities of PGK gene origin. Two PGK sequences from methanogenic archaebacteria exist in the database [ 13 ], but methanogens might not be representative of all archaebacteria concerning PGK evolution. Methanogens use PGK in hexose biosynthesis rather than energy metabolism [8, 9], whereas halophilic archaebacteria possess rhodopsin-based photosynthetic membranes and have been reported to possess ribulose-1,5-bisphosphate carboxylase activity [2, 38], a key Calvin cycle enzyme which methanogens lack altogether. It is thus conceivable that aspects of sugar phosphate metabolism in halophiles are more similar to that in photosynthetic eubacteria and eukaryotes, making them important links for understanding the evolution of genes for enzymes of sugar phosphate metabolism in general. Here we describe the cloning and structure of the PGK gene from the photosynthetic extreme halophile Haloarcula vallismortis and reevaluate PGK gene evolution as it relates to the origin of genes for plant chloroplast and cytosol PGK. Material and methods Culture conditions and nucleic acid isolation Haloarcula vallismortis type strain 3756 (ATCC 29715) was obtained from the Deutsche Sammlung ftlr Mikroorganismen (Braunschweig) and grown under shaking at 43 °C in the medium described [37]. Cells were grown to a n OD6o 0 of 3.0, harvested by centrifugation and lysed by suspension in 50 mM Tris-HC1, 50 mM EDTA pH 8.0. The lysate was purified by phenol/ chloroform extraction, nucleic acids were recovered by ethanol precipitation, DNA was purified by CsC1 centrifugation in a Ti70 rotor. 67 Isolation of a pgk gene from Haloarcula vallismortis A library was constructed in 2EMBL4 via Sau 3AI partial digestion and purification of 18-22 kb fragments in saccharose gradients, ligation and packaging as described [40]. Recombinants were plated on Escherichia coli K803 and screened by plaque hybridisation at 32 °C in hybridization buffer (6 × SSPE, 0.1 ~o SDS, 0.02~o PVP, 0.02~o Ficoll-400) with 10 ng/ml (5 × 10 6 cpm/ml) of an end-labelled 16-fold degenerate 16mer constructed against the conserved amino acid motif WYDNEY found in peptides sequenced from the H. vallismortis GAPDH enzyme [37]. Positives were purified from which the hybridizing and overlapping 3.0 kb Bss HII, 3.3 kb Pst I and 2.4 kb Sac II fragments were subcloned into pBluescript vectors (Stratagene). The region encompassing the gap and pgk genes was sequenced on both strands via Exo III deletion series and synthetic oligonucleotide primers using the dideoxy method. Other molecular methods were performed as described [40]. Sequence analysis Standard sequence analysis was performed with the GCG Package [10]. Sources of sequences used for analysis are given in the legend to Fig. 3. Sequences were aligned with clustal V [22], the alignment was refined by eye with lineup. After exclusion of positions not occupied by an amino acid in all sequences, 308 positions remained in the final alignment which was used for phylogenetic inference. Distance between sequences was measured as numbers of amino acid substitutions per site corrected for multiple substitutions by assuming a gamma distribution for the variability of sub stitution rate acro s s positions [24 ]. For this, a neighbour-joining tree [39] was constructed using the proportion of amino acid differences between PGK sequences, from which the gamma parameter (a) was estimated. The value of a thus determined (2.28) was used to estimate numbers of substitutions per site between sequences using the gamma correction. The resulting distance matrix yielded the final neighbour joining tree (calculated with a program kindly provided by M. Nei). The reliability of branches was estimated by bootstrapping (100 replicates) using the same gamma parameter and by bootstrap parsimony analysis using protpars of phylip [15]. Results and discussion H. Vallismortis possesses a gap-pgk gene cluster Ten phages which positively hydridized to the synthetic oligonucleotide probe were purified and shown by restriction mapping to overlap in the region encompassing the gap gene. Several overlapping fragments spanning the region were subcloned and sequenced. Due to the very high GC content ofH. vallismortis DNA, a large number of sequencing runs on both strands in addition to the use of deaza analogues were necessary before all bases had been determined unambiguously. The gap gene sequence (GenBank accession number L47295) was found to contain both of the oligopeptide sequences reported by Pr(iss et al. [37] with no discrepancies. The deduced amino acid sequence of the Haloarcula gap gene product shares roughly 50~o identical residues with GAPDH proteins from eubacteria and eukaryotes, but like eubacterial GAPDH, it shares only ca. 15~o identical residues with sequences from methanogenic archaebacteria [19] suggesting that, in analogy to class I and class II aldolases [36], type I and type II glutamine syntheases [27], or form I and form II Rubisco [7, 33, 47], archaebacteria may possess 'class I' and 'class II' GAPDH (manuscript in preparation). Downstream of the gap gene we found a 1203 bp open reading frame, the predicted product of which (Fig. 1) shows ~ 3 7 ~ identity to PGK from methanogenic archaebacteria and < 30~o identity to PGK from eukaryotes and eubacteria. Between the stop codon of the gap gene and the start codon of the pgk gene there is a 296 bp stretch which contains several terminator-like hairpin structures and palindromic sequences, the 68 . . . . . T~GCGC~AAACGGTTTGGGACGCCAGTAAAGTTTAT~TTATTATTCGCT~G~TAG 60 . . . . . 120 G~ACACATAGAGTCCG~CCAGCACCTGACCCCTGTTCGGCTCTGTGTG~GACACAGG . . . . . 180 CCCGTTTTCCCAGCCGGCGTGCTGTGTCCCGCGCCGGCTGTGTACGGCGCAGGTGTCCCC >>>>>>>>>>>>> <<<<<<<<<<<<< 240 ACCACTCTCCGACCCCATT~TTACGTTGGCAGC~GGCAGACCGCAGTGTTACCCCCAG 300 CAGGTCGCTCGCTGTTCGGACTC~AAACCGTT~GCGGGCGGCAGGTTTCCTCTCG~T M 360 GATGACTTTCCAGACGCTCGATGACCTCGACGATGGACAGCGCGTCCTCGTCCGACTCGA M T F Q T L D D L D D G Q R V L V R L D 420 CCTC~CTCACCGGTCGAGGACGGCACAGTACAGGAC~TCGACGGTTCGACCGCCACGC L N S P V E D G T V Q D N R R F D R H A 480 GGAGACTGTC~GG~CTCGCTGACC~GGGTTCGAGGTC~AGTGCTGGCCCATCAGGG E T V K E L A D R G F E V A V L A H Q G 540 CCGCCCGGGCCGCGACGACTTCGTCTCGCTCGACCAGCACGCCGACATCCTCGCCGACCA R P G R D D F V S L D Q H A D I L A D H 600 CATCGACCGCGACGTGGATTTCGTCGACGAGACCTACGGGCCACAG~GATTCACGATAT I D R D V D F V D E T Y G P Q A I H D I 660 C~CGACCTCGATAGTGGCGACGTGCTCGTTCTGGAG~CACGCGGATGTGTGACGACGA A D L D S G D V L V L E N T R M C D D E 720 ACTGCCCGAGG~GACCCCG~GTG~CCCAGACGGAGTTCGTC~CGCTGGCCGG L P E E D P E V K A Q T E F V K T L A G 780 GGAGTTCGACGCCTACATC~CGACGCCTACTCCGCG~CCACC~TCGCACGCCTCGCT E F D A Y I N D A Y S A A H R S H A S L 840 CGTGGGCTTCCCGCTCGTGAT~ATGCCTACGCC~TCGCGTGAT~AGACC~GTACGA V G F P L V M D A Y A G R V M E T E Y E 900 GGCC~CACCGCCATCGCCGAG~GGAGTTCGACGGGCAGGTGACGATGGTCGTCGGCGG A N T A I A E K E F D G Q V T M V V G G 960 GACG~GGCCACCGACGTCATCGACGTGATGACCCACTTAGAC~G~GGTCGACGACTT T K A T D V I D V M T H L D E K V D D F 1020 CTTGCTTGGCGGTATCGCG~CTGTTCCTGCGGCAGCC~CCACCCAGTCGGCTACGA L L G G I A G T V P A A A G H P V G Y D 1080 CATCGACGACGCGAACCTCTACGACGAGCAGTGGGAGGCAAACAGTGAG~GATC~GTC I D D A N L Y D E Q W E A N S E K I E S 1140 CAT~TCG~GACCACCGG~CCA~TTACCCTCGCGGTC~CCTG~CTAC~ACGA M L E D H R D Q I T L A V D L A Y E D E 1200 G~CGACGACCGCGCCGAGCAGGCCGTCGACGACATCGACGAGAAAAGGCTCTCGTACCT N D D R A E Q A V D D I D E K R L S Y L most prominent of which is indicated in the figure. Upstream of the palindrome is an AT-rich region which stands out due to the high GC content of the region sequenced (60 ~o overall, 85 ~o at third codon positions). A TATA-like motif is present 25 bp upstream of the pgk start codon. Some gene clusters, such as the tryptophan operon, are similarly organized in archaebacteria and eubacteria [28, 29]. A gap-pgk gene cluster is found in both archaebacterial and eubacterial genomes (Fig. 3), which suggests that this could represent a gene organisation which might have been present in the progenote. The gene cluster gap-pgk-tpi has been found in gram positive eubacteria and Thermotoga maritima, although in Thermotoga PGK and TPI are fused to a single bifunctiona170 kDa protein [44]. In two 7-purple bacteria, the fba gene for fructose-l,6-bisphosphate aldolase (Class II) [ 1] instead oftpi is found downstream ofpgk (Fig. 3). Although in E. coli the gap-pgk association is preserved, in the Haemophilus genome the pgk-fba cluster (accession number U32734) is separated from gap and tpi by over 500 and 100 kb, respectively, indicating that 'reshuffling' of these genes occurs at least occasionally in bacterial evolution. In accordance with that view is the finding that pgk in the thermoacidophilic archaebacterium Sulfolobus (not mentioned in [3]) is located next to a gap gene, but in pgk-gap order. Furthermore, the Sulfolobus gap gene encodes a Class II G A P D H enzyme (methanogen-like, see above), suggesting convergent pgk gene cluster organization in this thermoacidophile. 1260 CGACGT~GCAGCGAGACCCTCATGGAGTACTCGCCGATCATCCGCGAGTCCGAG~CGT D V G S E T L M E Y S P I I R E S E A V 1320 CTTCGGTG~G~CGCGCTGGGATGTTCGAAGACG~CGGTTCTC~TCGGGACTGCCGG F G E G R A G M F E D E R F S V G T A G 1380 TGTGCTCG~GCCATCGCCGACACCGACTGTTTCTCCGTCGTCGGCGGCGGCGACACCTC V L E A I A D T D C F S V V G G G D T S High net charge of PGK from an extreme halophile H. vallismortis grows well at 43 °C in 4-5 M NaCI, in our hands a doubling time of about 4 h 1440 AC~GCTATCGAGATGTAC~GATGG~G~GAC~GTTCGGCCACGTCTCCATCGCTGG R A I E M Y G M E E D E F G H V S I A G 1500 CGGGGCCTACATCCG~CGTTGACGCGCGCAC~CTGGTCGGCGTCG~GTCCTCCAGCG G A Y I R A L T R A Q L V G V E V L Q R Fig. 1. Sequence of the Haloarcula vallismortispgk gene in the gap-pgk cluster. An AT-rich region (underlined) and a con- 1560 CT~CTGCGGCCTTTCGTCTCGGCGTGTCGCAGGTCCCAGTCGTTGCGTTCGTCCGGG~ spicuous palindrome (>>> <<< ) are indicated. The putative TATA box is double underlined, the stop codon of the gap gene is indicated with three asterisks, that of the pgk gene with one asterisk. 1620 CGTCGACG~CCGCCGCGTCTCGGTCGACTCCGTCCGTGCGTGGCGGGCAGTCCAGTCGT GGTCAGCGTCGGGCGTGGCTGCCATACACTGCGTTACA 69 was observed. Enzymes of sugar phosphate metabolism from extreme halophilic archaebacteria require salt concentrations of 2 to 3 M in order to achieve full enzyme activity [2, 6, 11, 37], a concentration which would be sufficient to precipitate many proteins from non-halophilic organisms. Furthermore, intracellular salt concentrations in extreme halophiles are typically at least as high as those of the environment (ca. 3.5 M) [9], it is therefore perhaps not surprising that the deduced amino acid sequence of H. vallismortis P G K has 93 aspartate plus glutamate residues (D + E) in addition to 41 histidine, lysine plus arginine (H + K + R) residues for total of 134 out of 401 (33~o) highly charged residues and an overall predicted charge of -63 from the simple primary sequence. By comparison, E. coli P G K has an overall charge of -11 (55 D + E and 48 H + K + R), human P G K a charge of + 3 (50 D + E and 58 H + K + R), and P G K from the non-halophilic archaebacterium Methanothermus fervidus a charge of +5 (55 D + E and 68 H + K + R). The highly negative charge ofH. vallismortis P G K is, however, comparable to that previously reported for various cytoplasmic enzymes from extreme halophiles: superoxide dismutase, -32 [25]; NADP-specific glutamate dehydrogenase, -40 [4]; dnaK, - 102 [ 17 ]; enolase, -53 [26]. This charge is thus not the product of chance, but rather a common characteristic of cytoplasmic proteins from halophiles which likely serves to keep them in solution at high to saturating cellular salt concentrations. An alignment of phosphoglycerate kinase sequences from eukaryotic, eubacterial and archaebacterial genomes (Fig. 2) indicates that the distribution of charge in the H. vallismortis sequence is rather uniform, revealing no obvious concentration of either positively or negatively charged residues in any portion of the protein primary structure. Plant chloroplast/cytosol PGK isoenzymes: a gene duplication Based upon an expanded version of the alignment in Fig. 2, phylogenetic analysis of P G K sequences including chloroplast and cytosolic isoenzymes from plants provide an intruiging picture of gene evolution (Fig. 3). Both chloroplast and cytosolic P G K sequences from plants assume a very distinct position in the gene tree relative to pgk genes of non-photosynthetic eukaryotes. The cytosolic enzyme does not branch with its homologues from higher eukaryotes and very importantly, genes for both plant isoenzymes share a common branch which is located well below pgk genes from the kinetoplastid protists Trypanosoma and Crithidia with a quite high bootstrap percentage (94) (Fig. 3). This finding quite clearly indicates that neither genes for the cytosolic nor the chloroplast enzyme in plants surveyed (only chlorophytes) are orthologues of the cytosolic enzyme in nonphotosynthetic eukaryotes, since orthologues would be expected to branch well above kinetoplastid genes, close to animals and fungi. This unexpected pattern of similarity was perceived by Longstaff et al. [30], who put forth a 'recombination' hypothesis (as one of several alternatives) to explain the lack of expected similarity between wheat cytosolic P G K and its eukaryotic homologues. Under that model, the gene for the plant cytosolic enzyme is orthologous to its counterpert in animals and fungi whereas the gene for the chloroplast enzyme derives from the cyanobacteria-like antecedents of plastids, and subsequent recombination was suggested to have occurred between them resulting in genes with intermediate similarity to eubacterial and eukaryotic homologues. The recombination hypothesis was thus envoked under Weeden's [49] gene transfer corollary as a subordinate scenario to explain aspects of the data and was accepted in later studies of P G K evolution [14, 48]. However, recombination does not influence divergence between recornbining genes and therefore cannot account for the finding that the plant isoenzymes are more similar (82~o identity) to each other than to any eukaryotic or eubacterial homologue. The by far most straightforward interpretation of the data is that genes for the chloroplast and cytosolic isoenzymes ofphosphoglycerate kinase in higher plants arose through gene duplication in eukaryotic ge- 70 1 90 .... + + . . . . ++ -++ - +- -+ - + + +-- Haloarcula Methanobacterium Methanothermus MMTFQTLDD.LDDGQRVLVRLDLNSPVEDGT...VQDNRRFDRHAETVKELADRGFEVAVL.AHQGRPGRDDFV ................ MSLPFYTIDDF'NLEDKTVLVRVDINSPVDPST.GSILDDTKIKLHAETIDEISKKGAKTVVL.AHQSRPGKKDFT ................ MFKFYTMDDF.DYSGSRVLVRVDINSPVDPHT.GRILDDTRMRLHSKTLKELVDENAKVAIL.AHQSRPGKRDFT ................ Human SLSNKLTLDKLDVXGKRVVMRVDFNVPMKNNQ...ITNNQRIKAAVPSIKFCLDNGAKSVVLMSHLGRPDGVPMPD ............... Aspergl I Ius MSLTSKLSITDVDLKDKRVLIRVDFNVPLDKNDNTTITNPQRIVGALpTIKYAIDNGAKAVILMSHLGRpDGKKNp ................ Saccharomyces SLSSKLSVQDLDLKDKRVFIRVDFNVPLDGKX.--ITSNQRIVAALPTIKIrVLEHHpRYVVLASHLGRPNG.ERNE ............... T. brucei c y t MSLKERKSINECDLKGKKVLIRVDFNVPLDDGN...ITNDYRIRSALPAVQKVLTEGGS.CVI~MSHLGRPKGVSMAEGKELRSTGGIPGFEQ MTLNEKKSINECDLKGKKvLIRVDFNVP9q4~GK...ITNDYRIRSALPTLKKVLTEGGS.CVLMSHLGRPKGIPMAQAGKIRSTGGVPGFQQ T. brucei g l y c MNKKTLKDIDVKGKRVFCRVDFNVPMKDGK.--VTDETRIRAAIPTIQYLVEQGAK.VILASHLGRPK-GEVVE ............... Bacillus meg. ............... Corynebacterium M A V K T L K D L L D E G V D G R H V I V R S D F N V P L N D D R . . E I T D K G R I I A S L P T L K A L S E G G A K . V I V M A H L G R . . Q G E V N E MSVIKMTDLDLAGKRVFIRADLNVPVKDGK.--VTSDARIRASLPTIELALKQGAK-VMVTSHLGRPTEGEYNE ............... Escherichia MRTLLDLDPKGKRVLVRVDYNVPVQDGK--.VQDETRILESLPTLRHLLAGGAS.LVLLSHLGRPKGPDP ................. Thermus Zymomonas MAFRTLDDIGDVKGKRVLVREDLNVPMDGDR..-VTDDTRLRAAIPTVNELAEKGAK.VLILAHFGRPKGQPNP ................ Triticum MATKRSVGTLGEADLKGKKVFVRADLNVPLDDAQ-.KITDDTRIRASIPTIKYLLEKGAK.VILASHLGRPKGVTP ................. cyt chl <<MAKKSVGDLTSADLKGKKVFVRADLNVPLDDSQ..NITDDTRIRAAIPTIKHLINNGAK.VILSSHLGRPKGVTP ................. Spinacia chl <<MAKKSVGDLTAADLEGKRVLVRADLNVPLDDNQ..NITDDTRIRAAIPTIKYLLSNGAK-VILTSHLGRPKGVTP ................. Trl ti cum * * * * * , * 180 - Haloarcula Me than oba c teri um Methanothermus Human Aspergillus Saccharomyces T. brucei c y t T. brucei g l y c Bacillus meg. Coryn ebac teri um Escherichia Thermus Zymomonas Triticum cyt Spinacia chl Triticum chl + -+ - -+- +. . . . ++ Haloarcula Methanobacterium Methanothermus Human Aspergillus Saccharomyces T. brucei c y t T. brucei g l y c Bacillus meg. Cory~ebacterium Escherichia Thermus Zymomonas Triticum cyt Spinacia chl Triticum chl . . + . . . . . . . + - + - _ _ •S L D Q H A D I L A D H I D R D V D F V D E T Y G P Q A I H D I A D L D S G D V L V L E N T R M C D D E L ........ PEEDPEVKAQTEFVKTLAGEFDAYIND • •T L Q Q H A K A L S N I L N R P V D Y I D D I F G T A A R E E I K R L K K G D I L L L E N V R F Y P E E I ........ LKRDPHQQAETHMVRKLYPI IDIFIND • -T M E E H S K I r L S N I L D M P V T Y V E D I F G C A A R E S I R N M E N G D I ILLENVRFYSEEV ........ LKRDPKVQAETHLVRKLSSVVDYYIND KYS LE PVAVELKSLLGKDVLFLKDCVGPEVEKACANpAAG SV ILLENLRFHVEEEGKGKDASGNKVKAE PAKI EAFRAS L S KLGDVYVND KYS LK PVVPKLKELLGRDV IFTEDCVGPEVEETVNKASGGQVI LLI~LRFHAEEEG S SKDADGNKVKADKDAVAQFRKGLTALGDI T IND KYS LAPVAKELQ SLLGKDVTFLNDCVGPEg-EAAVKASAPGSVI L L E N L R Y H I E E E G S R K V -D G Q ~ S K E D V Q K F R H E L S S LADVY IND K A T L K P V A K A L S E L L S R P V T F A P D C L - •N A A D W S KM S PGDVVLLENVRFYKEE ........ GSKSTEEREAMAKIAKAFAELADVYVND K A T L K P V A K A L S E L L L R P V T F A P D C L . •N A A D W S KMS PGDVVLLENVRFYKEE ........ GSKKAKDREAMAKI • • -L A S Y G D ~ F / I S D ELRLNAVAERLQALLGKDVAKADEAFGEEVKKT I DGMS EGDVLVLENVRFYPGEE ........... KNDPEL ....... LASYGDVYISD KYS LAPVAEAL S DELGQYVALAADVVGEDAHERANGL T EGD I LLLENVRFDpRET S ........ KDEAERNRFAQELAALAADNGAFVS D EF S LL PVVNYLKDKL SNPVRLVKDYL ...... DGVDVAEGELVVLENVRFN ............... KGEKKDDETL S KKYAALC DVFVMD KYS LAPVGEALRAHL PEARFAPFP PGS EEARREAEALRPGEVLLLENVRFEPGEE ........... KNDPE .... LSARYARLGEAFVLD EMS LAR IKDALAGVLGRPVHF IND IKGEAAAKAVDALNPGAVALLENTRFYAGEE ........... KNDPA .... LAAEVAKLGDFYVND KF S LKPLVARL SELLGLEVVMAPDC IGEEVEKLAAAL PDGGVLLLENVRFYKEEE ........... ~NDPE .... FAKKLASVADLYVND KFSLAPLVPRLSELLGLQVVKADDC IG P D V E K L V A E L P E G G V L L L E N V R F Y K E E E ........... KNDpE .... FAKKLASLADLYVND KF S LAPLVPRL S ELLG IEVKKAEDVI G PEVEKLVADLANGAVLL LENVRFYKEEE ........... KNDPE .... FAKKLASLADLFVND • • + - + . . . . *** +- • • - + - - + --+ 270 + __ AYSAAHRSHASLVGFPLvM.DAYAGRVMETEYEANTAIAEKEFDGQVTMVvGGTKATDVIDvMTHLDEKvDDFLLGGIAGTvPAAAGHPV AFAAAHRSQPSLVGFAVKL.PSGAGRIMEKELKSLYGAVDNA.EKP~DSIMVLENVLRN.GSADYVLTTGL~ANIFLWAS AFAAAHRSQPSLVGFPLKL.PSAAGRLMEREVKTLYKIIKNV.EKPCVYILGGVKIDDSIMIMKNILKN.GSADYILTSGLVANVFLEAS AFGTAHRAHSsMVGVNLPQ..KAGGFLMKKELNYFAKALESP.ERPFLAILGGAKVADKIQLINNMLDK...VNEMIIGGGMAFTFLKVL AFGTAHRAHSSMVGVDLPQ..KASGFLVKKELEYFAKALEEP.QRPFLAILGGSKVSDKIQLIDNLLPK...VNSLIITGGMAFTFKKTL AFGTAHRAHsSMVGFDLPQ..RAAGFLLEKELKYFGKA~ENP.TRPFLAILGGAKVADKIQLIDNLLDK...VDSIIIGGGMAFTFKKVL AFGTAHRDSATMTGIPKILGHGAAGYLMEKEISYFAKVLGNP.PRPLVAIVGGAKVSDKIQLLDNMLQR..•IDYLLIGGAMAYTFLKAQ AFGTAHRDSATMTGIPKILGNGAAGYLMEKEISYFAKVLGNP.PRPFTAIVGGAKVF~DKIGVIDHLLDK...VDNLIIGGGLSYTFIKAL AFGAAHRAHASTEGIAQHI.PAVAGFLMEKELDVLSKALSNP.ERPLVAIVGGAKVSDKIQLLI~WMLQR...IDYLLIGGAMAYTFLKAQ GFGVVHRAQTSVYDIAKLL.PHYAGGLVETEISVLEKIAESP.EAPYVVVLGGSKVSDKIGVIEALAAK...ADKIIVGGG~CYTFLAAQ AFGTAHRAQASTHGIGKFADVACAGPLLAAELDALGKALKEP.ARPMVAIVGGSKVSTKLTVLDSLSKI...ADQLIVGGGIANTFIAAQ AFGSAHRAHASVVGVARLL.PAYAGFLMEKEVRALSRLL••DP.ERPYAVVLGGAKVSDKIGVIESLLPR...IDRLLIGGAMAFTFLKAL AFSAAHRAHVSTEGLAHKL.PAFAGRAMQKELEALEAALGKP.THPVAAwGGAKVSTKLDVLTNLVSK...VDHLIIGGGMANTFLAAQ AFGTAHRAHASTEGVTKFLRPSVAGFLMQKELDYLVGAVANP.KKPFAAIVGGSKVSSKIGVIESLLAK...VDILILGGGMIFTFYKAQ AFGTAHRAHAST•GVTKFLKPSVAGFLLQKELDYLVGAVSNP.KRPFAAIVGGSKVSSKIGVIESLLEK...CDILLLGGGMIFTFYKAQ AFGTAHRAHASTEGVTKFLKPSVAGFLLQKELDYLDGAVSNP.KRPFAAIVGGSKVSSKIGVIESLLEK...CDILLLGGGMIFTFYKAQ ** * * ** * * 360 . Haloarcula Me thanobac teri um Methanothermus Httn~n Aspergi i I us Saccharomyces T. brucei c y t T. brucei g l y c Bacillus meg. Corynebacterium Escherichia Thermus Zymomonas Triticum cyt Spinacia chl Triticum chl . . . . . . + - --++ . . . . . . . + . . . . . ++ - _ _ +- - G •Y D I D D A N L Y D E Q W E A N S E K I ESMLEDHRDQI TLAVDLAYEDENDDRAEQAV . . . . D D I D E K R L S -Y L D V G S E T L M E Y S P I IRESEAVF G. IN ..... LGKYNEDFI INKGYIDFVEKGKQLLEEFDGQI ~PDDVAVCVDNARVEYCTKNI PNKPIYDIGTNT I -T E Y A K F I R D A K T I G- ID . . . . . I K E K N R K I L Y R K N Y K K F IKMAKKLKDKYGEKI LTPVDVAINE~GKRIDVP IDDI PNFPIYDIGMET I •K I Y A E K I R E A K T I N N M E I G T S • LFDEEGAKIVKDLMSKAEKNGVK I TL PVDFVTADKFDENAKTGQATVASG I PAGWMG. -LDCG PES SKKYAEAVTRAKQ IV E N V K I G S S •L F D E A G S K I V G N I I EKAKKHNVKVVL PVDYVTADKFAADAKTGYATDEQ G I P D G Y M G - •L D V G E K S V E S Y K Q T I A E S K T I L E N T E I G D S -I F D K A G A E I V P K L M E K A K A K G V E V V L P V D F I I A D A F S A D A N T K T V T D K E G I P A G W Q G • •L D N G P E S R K L F A A T V A K A K T IV G - Y S I G I S •M C E E S K L E F A R S L L K K A E D R K V Q V I L P I D H V C H T E F K A % r D S P L I T E D •Q N I P E G H M A - -LDIGPKTIEKYVQTIGKCKSAI G - YSIGKS - KCEESKLEFARSLLKKAEDRKVQVILPIDHVCHTEFKAVDS PLITED .QNI PEGHMA . -LDIGPKT IEKYVQT IGKCKSAI G • HEVGKS - LLEEDK I ELAKS FMEKAKKNGVNFYMPVDVVVADDF SNDANI QVVS I • EDIPSDWEG • •LDAG PKTRE IYADVI KNSKLVI G - H N V Q Q S •L L Q E E M K A T C T D L L .... ASVDK IVL PVDLVAASEFNKDAEKQ I V D L - D S I P E G W M S - -L D I G P E S V K N F G E V L S T A K T IF G •H D V G K S -L Y E A D L V D E A K R L L ..... T TCN I PVP SDVRVATEF SETAPATLKSV •N D V K A D E Q I • - L D I G D A S A Q E L A E I LKNAKT i L G •G E V G R S - L V E E D R L D L A K D L L G R A E A L G V R V Y L P E D V V A A E R I E A G V E T R V F P A •R A I P V P Y M G • • L D I G P K T R E A F A R A L E G A R T V F G •V D V G K S • L C E H E L K D T V K G I F A A A E K T G C K I HL PSDVVVAKEFKANPP I R T I P V . S D V A A D E M I • •L D V G P K A V A A L T E V L K A S K T L V G -L A V G K S •L V E E D K L E L A T S L I E T A K S K G V K L L L P T D V V V A D K F A A D A E S K 7 V P A •T A I P D G W M G • • L D V G P D S I K T F A E A L D T T K T V I G •M S V G S S • L V E E D K L D L A T S LLAKAKEKGVS LLL PT DVVIADKFAADADSKI V P A • S G I P D G W M G • •L D I G P D S I K T F S E A L D T T Q T V I G -L S V G S S -L V E E D K L E L A T S LLAKAKAKGVS LLL P SDVI IADKFAPDANSQTVPAS A I P D G W M G - oL D I G P D S V K T F N D A L D T T Q T I I 450 - Haloarcula Methanobacterium Methanothermus Human Aspergillus Saccharomyces T. brucei c y t T. brucei g l y c Bacillus meg. Corynebacterium Escherichia Thermus Zymomonas Triticum cyt Spinacia chl Tritlcum chl + ---+ . . . . + . . . . . + + + - + G •E G R A G M F E D E R F SVGTAGVLEA TADTDC .... F SVVGGGDT SRAI EMYGMEEDEFGHVS IAGGAY IRALTRAQ LVGVEVLQR FANGPAGVFEQEGFS IGTEDI LNT IASSNG .... YSI IGGGHLAA QMGLSSG. ITH I SSGGGAS INLLAGEKLPVVE I LTEVKMKGRK FANGPAGVFEEQQFS IGTEDLLNAIASSNA .... FSV IAGGHLAAAAEKMG I S N K •I N H I S S G G G A C 7 A F L S G E E L P A I K V L E E A R K R S D K Y I W -N G P V G V F E W E A F A R G T K A L M D E V V K A T S R •G C I T I I G G G D T A T C C A K W N T E D K •V S H V S T G G G A S L E L L E G K V L P G V D A L S N I L W -N G P P G V F E M E P F A K A T K A T L D A A V A A V Q N •G A T V I I G G G D T A T V A A K Y G A E D K -I S H V S T G G G A S L E L L E G K E L PGVAAL SEKS K W' NGP PGVF E F EKFAAGTKALLDEVVKS S A A •G N T V T I G G G D T A T V A K K Y G V T D K •I S H V S T G G G A S L E L L E G K E L PGVAFL SEKK W •N G P M G V F E M V P Y S K G T F A I A K A M G R G T H E H G L M S I IGGGESAGAAELCGEAAR •I S H V S T G G G A S L E L L E G K T L PGVTIrLDEKE W -N G P M G V F E M V P Y S K G T FA IAKAMGRGTHEHGLMS I IGGGDSASAAELSGFJkKR -M S H V S T G G G A S L E L L E G K T L PGVTVLDEKSAVVS >> W -N G P M G V F E L D A F A N G T K A V A E A L A E A T D • • •T Y S V I G G G D S A A A V E K F N L A D K •M S H I S T G G G A S L E F M E G K E L P G V V A L N D K W -N G P M G V F E F A A F S E G T R A S P R P S S M Q H A G N D A F S V V G G G D S A A S V R V L G L N E D G F S H I STGGGASLEYLEGKEL PGVAILAQ W •N G P V G V F E F P N F R K G T E I V A N A I A D S E . . . . A F S I A C ~ _ ~ 3 D T L A A I D L F G I A D K •I S Y I S T G G G A F L E F V E G K V L PAVAMLE ERAKK W -N G P M G V F E V P P F D E G T L A V G Q A I A A L E • • •G A F T V V G G G D S V A A V N R L G L K E R •F G H V S T G G G A S L E F L E K G T L P G L E V L E G W -N G P L G A F E I E P F D K A T V A L A K E A A A L T KAGS L I SVAGGGDTVAALNHAGVAKD •F S F V S T A G G A F L E W M E G K E L PGVKALEA W -N G P M G V F E F E K F A A G T D A IAKQLAELTGK •G V T T I I G G G D S V A A V E K A G L A D K •M S H I S T G G G A S L E L L E G K P L P G V L A L D E A W -N G P M G V F E F E K F A A G T E A I A K K L E E I S K K •G A T T I I G G G D S V A A V E K V G V A E A -M S H I S T G G G A S L E L L E G Q T A S W S T C S W -N G P M G V F E F D K F A V G T E S I A K K L A E L S K K -G V T T I I G G G D S V A A V E K V G V A D V -M S H I S T G G G A S L E L L E G K E L P G V V A L D E G V M T R S V T V . ** ** • *** 71 nomes. That view is strongly supported by the position of the Chlamydomonas sequence, which branches robustly with other plant sequences but below the node corresponding to the duplication event which gave rise to the chloroplast and cytosolic isoenzymes of wheat (and presumably of other higher plants). The position of the spinach chloroplast P G K sequence suggests that the duplication occurred before the separation ofmonocots and dicots. Gene duplications which resulted in enzymes of different eukaryotic cell compartments is a recurring theme of P G K evolution, as suggested by the portion of the tree beating the kinetoplastid sequences (Fig. 3). It is evident that genes for recompartmentalized P G K enzymes destined for the glycosome, a specific glycolytic microbody present in some but not all kinetoplastids [18], have arisen several times independently in kinetoplastid evolution. The three P G K genes of T. brucei and T. congolense are organized in tandem [34, 35], the first in the array (pgkA) carries a large insertion of roughly 80 amino acids not found in other P G K enzymes. The resulting product is of higher molecular weight (ca. 56 kDa) and is located in glycosomes [35], although it is not the major glycosomal form in T. brucei [34]. The topogenic signals involved in glycosomal precursor import are quite different from those for chloroplasts [34, 46], but for both plants and protists, the genes for differentially compartmentalized P G K isoenzymes arose through duplication (Fig. 3). ing that it represents an orthologue of other archaebacterial P G K genes (the partial Sulfolobus P G K sequence also branches with these homologues, data not shown). The elevated rate of amino acid substitution for the H. vallismortis sequences implied by the length of its terminal branch very likely relates to the roughly 40 additional acidic amino acids that this sequence contains relative to its homologues from nonhalophilic organisms. An important aspect of Fig. 3 is that eubacterial and eukaryotic P G K genes are much more similar to each other (ca. 50~o identity) than either is to archaebacterial sequences (only ca. 30~o amino acid identity). This is in marked contrast to the majority of protein coding genes which have been studied in the three urkingdoms; they depict a different urkingdom topology, showing greater similarity between archaebacterial and eukaryotic genes relative to eubacterial homologues [5, 12, 16, 23, 51, 52]. This discrepancy in P G K gene evolution was a finding upon which, at least in part, Zillig's hypothesis for the origin of the eukaryotic genome via the fusion of archaebacterial and eubacterial genomes was originally based [51, 52] (see [12] for a review). Due to the high degree of divergence between archaebacterial and remaining P G K sequences, the position of the archaebacterial branch may be a long-branch artefact, as suggested by low bootstrap values in basal portion of the tree. Eubacterial origin of chlorophyte PGK genes Comparison of PGK sequences from the three urkingdoms H. vallismortis P G K is more similar to P G K from two methanogenic archaebacteria (Fig. 3) than it is to eubacterial or eukaryotic sequences, suggest- Note that the portion of the P G K tree for nonphotosynthetic eukaryotes is consistent with other molecular phylogenies regarding the branching order of trypanosomes, Plasmodium, animals and fungi. Note that if one disregards for a moment all eukaryotic sequences in the figure (i.e. by sim- Fig. 2. Alignmentof PGK sequences.Sourcesof sequencesare as givenin the legendto Fig. 3. Strictlyconservedresiduesin the alignmentare indicatedwith an asterisk. Positivelyand negativelychargedamino acids in the Haloarculavallismortissequenceare indicated above the alignmentwith '+' and ' - ' respectively.Gaps are indicated as dots. cyt, cytosolic;chl, chloroplast; glyc, glycosome. ' >>' indicates that the C-terminusof the T. bruceiglycosomalsequence(YASAGTGTLSNRWSSL)is not shown. ' <<' indicates the presenceof a transit peptide. Position numberingis arbitrary. 72 3-Phosphoglycerate Kinase Gene Tree s3/9 l°rJ~--J°I II ss ~uu ]l , , , Species and Gene Eukaryotea: Enzyme Compartment Prokaryotes: Gene Cluster Human Mouse Human (TS) Chicken Drosophila Schistosoma Tetrahymena cytosol cytosol cytosol cytosol cytosol cytosol cytosol Aspergillus Penicillium Trichoderma reesei Trichoderma viridis Neurospora Yarrowia Rhizopus pgk l Rhizopus pgk2 Kluyveromyces Saccharomyces Candida Plasmodium cytosol cytosol cytosol cytosol cytosol cytosol cytosol cytosol cytosol cytosol cytosol cytosol E U 100 100 92 68 r Y O t i C N U e I e a 4t"Eukaryotic gene duplications which resulted in recompartmentallzedenzymes 991 6 7 ~ 100 100 + l 100 100 k a ,1o i 100 :_t1 loo I 100 I loo 100 100 loo loo ] I ~ Crithidia fasciculata Crithidia fasciculata Trypanosoma brucei Trypanosoma brucei Trypanosoma brucei (56 kD) Trypanosoma congolense Trypanosoma congolense (56 kD) -- Crithidia fasciculata (56 kD) cytosol glycosome cytosol glycosome glycosome cytosol glycosome glycosome Spinach Wheat Wheat Chlamydomonas reinhardii (low GC g+) Bacillus megaterium (low GC g+) Bacillus stearothermophilus chloroplast chloroplast cytosol chloroplast gap-pgk-tpi gap-pgk (high GC g+) Corynebacterium glutamicum (high GC g+) Mycobacterium leprae Thermus aquaticus Thermotoga maritima gap-pgk-tpi gap-pgk-orf-tpi gap-pgk gap-pgk-tpi (o~-purple)Xanthobacter flavus (~-purple) Zymomonas mobilis (y-purple) Escherichia coil (y-purple) Haemophilus influenzae pgk gap-pgk gap-pgk-fba pgk-fba Methanobacterium bryanUi Methanothermus fervidus Haloarcula vallismortis pgk pgk gap-pgk r G e n e s 1 .~ A r c h Fig. 3. Neighbour joining tree for PGK protein sequences constructed from pairwise estimates of numbers of amino acid substitutions obtained assuming a gamma distribution for the variabiliy of substitution rate across positions. Genes found in eukaryotic genomes are designated with open branches, those found in eubacterial and archaebacterial genomes are designated with solid branches. Gene duplication events which resulted in subcellular recompartmentalization of enzymes are indicated with stars. Abbreviations are: TS, testis-specific; g +, gram-positive bacteria; GC, guanosine plus cytosine content; purple, purple bacteria; arch, archaebacteria. The scale bar at the upper left indicates 0.1 substitutions per site. PGK and TPI are fused in a single cotranslated protein in Thermotoga [44], as indicated by underlining. Numbers above nodes indicate the bootstrap proportion for 100 replicates using the same tree construction method, numbers below nodes (or in parenthesis) indicate the bootstrap proportion using protein parsimony of phylip. Bootstrap values lower than 50/100 are indicated as ' - ' . Sources of sequences are: Aspergillus nidulans P 11977, Bacillus megaterium P24269, Bacillus stearothermophilus P 18912, Candida maltosa P41757, Chlamydomonas 73 ply covering them with a piece of paper), the eubacterial portion of the P G K tree is also consistent in many aspects with traditional molecular systematic views of eubacterial evolution [ 50 ]: (1) common branching of purple bacterial sequences [45], (2) common branching of low-G + C grampositives [31], (3) common branching of high G + C-gram-positives [31 ], (4) separate branches for low and high G + C-gram-positives respectively [31], and (5) 'basal' positions within eubacteria for the thermophiles Thermus and Thermotoga. The 'problem' with the P G K tree is simply that P G K genes located in eukaryotic genomes are borne on eubacterial branches, and the plant sequences are distinct from remaining eukaryotic homologues. Two scenarios, each of which takes into account current theories on the origin of the eukaryotic nucleus can account for this finding. As to the first of these, we assume that the fusion hypothesis [12] is correct. It can account for the P G K gene tree topology observed, if (and only if) one makes the corollary assumptions that (1) the position of the archaebacterial 'root' is a treeing artefact, its true position lying on the 92/ 100 bootstrap branch bearing non-photosynthetic eukaryotic P G K sequences (which cannot be excluded), (2) eukaryotes lost the archaebacterial copy and retained only the eubacterial pgk gene subsequent to fusion, (3) the position of the plant sequences is explained as the result of subsequent acquisition via endosymbiotic gene transfer from eubacteria of a pgk gene which then underwent duplication to yield the plant isoenzymes, and (4) the preexisting gene for the plant homologue of cytosolic P G K from animals and fungi was simply replaced (or lost) subsequent to duplication. But what if the fusion hypothesis is not correct? As the second scenario, we assume that archaebacteria and eukaryotes are truly sister groups as suggested by several papers [12]. In that case, one can account for the P G K topology observed, if(and only if) the corollary assumptions are made that (1) all P G K sequences found in eukaryotic genomes are of eubacterial (endosymbiotic) origin, (2) the ancestral plant gene (which duplicated to yield the plant isoenzymes) is of eubacterial origin via endosymbiotic gene transfer from chloroplasts, and (3) the preexisting gene for the plant homologue of cytosolic P G K from animals and fungi was simply replaced (or lost) subsequent to duplication. In either case, the ancestral gene which gave rise via duplication to the plant isoenzymes are of eubacterial origin, and any preexisting P G K enzyme of the primitive chlorophyte cytosol must have been replaced (lost) subsequent to the duplication event. That would easily account for the finding that Chlamydomonas reinhardtii does not possess a cytosolic isoenzyme for P G K [43], rather only a chloroplast enzyme. Non-green photosynthetic protists (or Euglena) which branched early in eukaryotic evolution might have retained different type(s) of P G K genes. Plant P G K sequences are slightly more similar (with low bootstrap support) to P G K from low-G + Cgram-positive eubacteria (Bacillus) than to homologues from other eubacteria, the same is found for another nuclear-encoded chloroplast enzyme of eubacterial origin [20, 32]. No P G K sequence data from cyanobacteria are currently available. Under our working hypothesis of chloroplast origin for plant PGK, we would predict cyanobac- reinhardtii U14912, Corynebacterium glutamicum Q01655, Crithidiafasciculata pgkA glycosome(56PGK) P25055, Crithidiafasciculata pgkB cytosolP08966, Crithidiafasciculata pgkC glycosomeP08967, Drosophila melanogaster Q01604, Escherichia coli P11665, chicken L37101, Haemophilus influenzae U32734, human P00558, human testis-specific X05246, Kluveromyces lactis P14828, Methanobacterium bryantii P20972, Methanothermus fervidus P20971, mouse P09411, Mycobacterium leprae U00013, Neurospora crassa P38667, Penicilliurn citrinum P33161, Plasmodium falciparum P27362, Rhizopus niveus pgkl P29405, Rhizopus niveus pgk2 P29406, Saccharomyces cerevisiae P00560, Schistosoma mansoni L36833,Spinacia oleracea chloroplastP29409, Tetrahymena thermophila X63528, Thermotoga maritima X75437, Thermus aquaticus P09403, Trichoderma reesei P14228, Trichoderma viride P24590, Triticum aestivum chloroplast P12782, Triticum aestivum cytosol P12783, Trypanosoma brucei pgkA glycosome(56PGK) P08891, Trypanosoma brucei pgkB cytosolP07377, Trypanosoma brucei pgkC glycosomeP07378, Trypanosoma congolense pgkA glycosome (p56) P41762, Trypanosoma congolense pgkC cytosolP41760,Xanthobacterflavus U08462, Yarrowia lipolytica P29407, Zymomonas mobilis P09404. 74 terial pgk genes to be found which branch above Bacillus PGK but below the nuclear gene for Chlamydomonaschloroplast PGK in the pgk gene phylogeny. Acknowledgements We thank C. Schnarrenberger for valuable discussions. This work was supported by grants from the Deutsche Forschungsgemeinschaft. References 1. Alefounder PR, Perham SA: Identification, molecular cloning and sequence analysis of a gene cluster encoding the class II fructose 1,6-bisphosphate aldolase, 3-phosphoglycerate kinase and a putative second glyceraldehyde 3-phosphate dehydrogenase of Escherichia coli. Mol Microbiol 3:723-732 (1989). 2. Altekar W, Rajagopalan R: Ribulose bisphosphate carboxylase activity in halophilic archaebacteria. Arch Microb~ol 153:169-174 (1990). 3. Arcari P, Dello-Russo A, Ianniciello G, Gallo M, Bocchini V: Nucleotide sequence and molecular evolution of the gene coding for glyceraldehyde-3-phosphate dehydrogenase in the thermoacidophilic archaebacterium Sulfolobus solfataricus. Biochem Genet 31:241-251 (1993). 4. Benachenhou N, Baldacci G: The gene for a halophilic glutamate dehydrogenase: sequence, transcription analysis and phylogenetic implications. Mol Gen Genet 230: 345-352 (1991). 5. Brown JR, DoolittleWF: Root ofthe universal tree of life based on ancient aminoacyl-tRNA synthase gene duplications. Proc Natl Acad Sci USA 92:2441-2445 (1995). 6. Cendrin F, Chroboczek J, Zaccia G, Eisenberg H, Mevarech M: Cloning, sequencing and expression in Escherichia coli of the gene coding for malate dehydrogenase from the extremely halophilic archaebacterium Haloarcula marismortui. Biochemistry 23:4308-4313 (1993). 7. Chen J-H, Gibson JL, McCue LA, Tabita FR: Identification, expression, and deduced primary structure of transketolase and other enzymes encoded within the form II carbon dioxide fixation operon of Rhodobacter sphaeroides. J Biol Chem 266:20447-20452 (1991). 8. Danson MJ: Archaebacteria: the comparative enzymology of their central matabolic pathways. Adv Microbial Physiol 29:164-231 (1988). 9. Danson MJ: Central metabolism of the archaebacteria: an overview. Can J Microbiol 35:58-64 (1989). 10. Devereux J, Haeberli P, Smithies O: A comprehensive set of sequence analysis programs for the vax. Nucl Acids Res 12:387-395 (1984). 11. Dhar NM, Altekar W: Distribution of class I and class II fructose bisphosphate aldolases in halophilic archaebacteria. FEMS Microbiol Lett 35:177-181 (1986). 12. Doolittle RF: Of Archae and Eo: What's in a name? Proc Natl Acad Sci USA 92:2421-2423 (1995). 13. Fabry S, Heppner P, Dietmaier W, Hensel R: Cloning and sequencing the gene encoding 3-phosphoglycerate kinase from mesophilic Methanobacterium bryantii and thermophilic Methanothermusfervidus. Gene 91: 19-25 (1990). 14. Fothergill-Gilmore LA, Michels PAM: Evolution of glyeolysis. Progr Biophys Mol Biol 59:105-238 (1993). 15. Felsenstein J: PHYLIP: Phylogeny Inference Package (version 3.2). Cladistics 5:164-166 (1989). 16. Gogarten JP, Kibak H, Dittrich P, Taiz L, Bowman EJ, Bowman BJ, Manolson MF, Poole RJ, Date T, Oshima T, Konishi J, Denda K, Yoshida M: Evolution of the vacuolar H +-ATPase: Implications for the origin of eukaryotes. Proc Natl Acad Sci USA 86:6661-6665 (1989). 17. Gupta RS, Singh B: Cloning of the HSP70 gene from Halobacterium marismortui: relatedness of archaebacterial HSP70 to is eubacterial homologs and a model for the evolution of the HSP70 gene. J Bact 174:4594-4605 (1992). 18. Hannaert V, Michels P: Structure, function and biogenesis of glycosomes in Kinetoplastida. J Bioenerg Biomembr 20:205-212 (1994). 19. Hensel R, Zwickl P, Fabry S, Lang J, Palm P: Sequence comparison of glyceraldehyde-3-phosphate dehydrogenase from the three urkingdoms: evolutionary implication. Can J Microbiol 35:81-85 (1989). 20. Henze K, Badr A, Wettern M, Cerff R, Martin W: A nuclear gene of eubacterial origin in Euglena gracilis reflects cryptic endosymbioses during protist evolution. Proc Natl Acad Sci USA 92:9122-9126 (1995). 21. Henze K, Schnarrenberger C, Kellermann J, Martin W: Chloroplast and cytosolic triosephosphate isomerase from spinach: Purification, microsequencing and cDNA sequence of the chloroplast enzyme. Plant Mol Biol 26: 1961-1973 (1994). 22. Higgins DG, Sharp PM: CLUSTAL: A package for performing multiple sequence alignment on a microcomputer. Gene 73:237-244 (1988). 23. Iwabe N, Kuma K-I, Hasegawa M, Osawa S, Miyata T: Evolutionary relationship of archaebacteria, eubacteria and eukaryotes inferred from phylogenetic trees of duplicated genes. Proc Natl Acad Sci USA 86:9355-9359 (1989). 24. Jin L, Nei M: Limitations of the evolutionary parsimony method of phylogenetic analysis. Mol Biol Evol 7: 82-102 (1990). 25. Joshi PB, Dennis PP: Characterization ofparalogous and orthologous members of the superoxide dismutase gene family from genera of the halophilic archaebacteria. J Bact 175:1561-1571 (1993). 26. KrOmer WJ, Arndt E: Halobacterial $9 operon: three ribosomal protein genes are cotranscribed with genes en- 75 27. 28. 29. 30. 31. 32. 33. 34. 35. 36. 37. coding a tRNA-Leu, the enolase, and a putative membrane protein in the archaebacterium Haloarcula (Halobacterium) marismortui. J Biol Chem 266:24573-24579 (1991). Kumada Y, Benson DR, Hillemann D, Hosted TJ, Rochefort DA, Thompson CJ, Wohlleben W, Tateno Y: Evolution of the glutamin synthase gene, one of the oldest existing and functioning genes. Proc Natl Acad Sci USA 90:3009-3013 (1993). Lam WL, Cohen A, Tsouluhas D, Doolittle WF: Genes for tryptophan biosynthesis in the halophile archaebacterium Haloferax volcanff: the trpDFEG cluster. Proc Natl Acad Sci USA 87:6614-6618 (1990). Lam WL, Logan SM, Doolittle WF: Genes for tryptophan biosynthesis in the halophile archaebacterium Haloferax volcanii: the trpDFEG cluster. J Bact 174: 16941697 (1992). Longstaff M, Raines CA, McMorrow EM, Bradbeer JW, Dyer T: Wheat phosphoglycerate kinase: Evidence for recombination between the genes for the chloroplastic and cytosolic enzymes. Nucl Acids Res 17:6569-6580 (1989). Ludwig W, Kirchhof G, Klugbauer N, Weizenegger N, Betzl D, Ehrmann M, Hertel C, Jilg S, Tatzel R, Zitelsberger H, Liebl S, Hochberger M, Shah J, Lane D, WallnOferP, Scheifer KH: Complete 23S ribosomal RNA sequences of Gram positive bacteria with a tow G + C content. System Appl Microbiol 15:487-501 (1992). Martin W, Brinkmann H, Savona C, Cerff R: Evidence for a chimaeric nature of nuclear genomes: eubacterial origin of eukaryotic glyceraldehyde-3-phosphate dehydrogenase genes. Proc Natl Acad Sci USA 90:8692-8696 (1993). Morse D, Salois P, Markovic P Hastings JW: A nuclearencoded form II RuBisCO in dinoflagellates. Science 268: 1622-1624 (1995). Osinga KA, Swinkels BW, Gibson WC, Borst P, Veeneman GH, van Boom JH, Michels PAM, Opperdoes F: Topogenesis of microbody enzymes: a sequence comparison of the genes for the glycosomal (microbody) and cytosolic phosphoglycerate kinases of Trypanosoma brucei. EMBO J 4:3811-3817 (1985). Parker HL, Hill T, Alexander K, Murphy NB, Fish WR, Parsons M: Three genes and two isoenzymes: Gene conversion and the compartmentalization and expression of the phosphoglycerate kinases of Trypanosoma (Nannomohas) congolense. Mol Biochem Parasitol 69:269-279 (1995). Pelzer-Reith B, Penger A, Schnarrenberger C: Plant aldolase: cDNA and deduced amino-acid sequence of the chloroplast and cytosol enzyme from spinach. Plant Mol Biol 21:331-340 (1993). Pr~lss B, Meyer HE, Holldorf AW: Characterization of the glyceraldehyde-3-phosphate dehydrogenase from the extremely halophilic archaebacterium Haloarcula vallismortis. Arch Microbiol 160:5-I1 (1993). 38. Rawal H, Kelkar SM, Altekar W: Ribulose-l,5-bisphosphate dependent CO2 fixation in the halophilic archaebacterium Halobacterium mediterranei. Biochem Biophys Res Comm 156:451-456 (1988). 39. Saitou N, Nei M: The neighbor-joining method: a new method for the reconstruction of phylogenetic trees. Mol Biol Evol 4:406-425 (1987). 40. Sambrook J, Fritsch EF, Maniatis T: Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York (1989). 41. Schmidt M, Dvendsen I, Feierabend J: Analysis of the primary structure of the chloroplast isozyme of triosephosphate isomerase from rye leaves by protein and cDNA sequencing indicates a eukaryotic origin of its gene. Biochim Biophys Acta 1261:257-264 (1995). 42. Schnarrenberger C, Flechner A, Martin W: Enzymatic evidence indicating a complete oxidative pentose phosphate in the chloroplasts and an incomplete pathway in the cytosol of spinach leaves. Plant Physiol 108:609-614 (1995). 43. Schnarrenberger C, Jacobshagen S, MOiler B, Krtlger I: Evolution of isozymes of sugar phosphate metabolism in green algae. In: Ogita S, Scandalios JS (eds) Isozymes: Structure, Function, and Use in Biology and Medicine, pp. 743-7460. Wiley-Liss, New York (1990). 44. Schurig H, Beaucamp N, Ostendorp R, Jaenicke R, Adler E, Knowles JR: Pbosphoglycerate kinase and triosephosphate isomerase from the hyperthermophilie bacterium Thermotoga maritima form a covalent bifunctional enzyme complex. EMBO J 14:442-451 (1995). 45. Stackebrandt E, Murray RGE, Trtlper HG: Proteobacten'a classis nov., a name for the phylogenetic taxon that includes the 'purple bacteria and their relatives'. Int J Syst Bact 38:321-325 (1988). 46. Swinkels BW, Evers R, Borst P: The topogenic signal of the glycosomal (microbody) phosphoglycerate kinase of Crithida fasiculata resides in a carboxy-terminal extension. EMBO J 7:1159-1165 (1988). 47. Tabita RF: Carbon dioxide fixation and its regulation in cyanobacteria. In: Fay P, van Baalen C (eds) The Cyanobacteria, pp. 95-118 Elsevier, Amsterdam (1987). 48. Vohra GB, Golding GB, Tsao N, Pearlman RE: A phylogenetic analysis based on the gene encoding phosphoglycerate kinase. J Mol Evol 34 383-395 (1992). 49. Weeden NF: Genetic and biochemical implications of the endosymbiotic origin of the chloroplast. J Mol Evol 17: 133-139 (1981). 50. Woese CR: Bacterial evolution. Microbiol Rev 51: 221271 (1987). 51. Zillig W, Klenk H-P, Palm P, Leffers H, Ptlhler G, Gropp F, Garrett RA: Did eukaryotes originate by a fusion event? Endocyt C Res 6:1-25 (1989). 52. Zillig W: Comparative biochemistry of Archaea and Bacteria. Curr Opin Genet Devel 1:544-550 (1991).
© Copyright 2026 Paperzz