Higher-plant chloroplast and cytosolic 3

Plant Molecular Biology 30: 65-75, 1996.
© 1996 Kluwer Academic Publishers. Printed in Belgium.
65
Higher-plant chloroplast and cytosolic 3-phosphoglycerate kinases:
a case of endosymbiotic gene replacement
Henner Brinkmann 1 and William Martin 2"*
l Institutfiir Botanik, Technische Universitgit Braunschweig, Spielmannstrasse 7, D-38023 Braunschweig,
Germany; 2Institut ffir Genetik, Technische Universitiit Braunschweig, Spielmannstrasse 7, D-38023
Braunschweig, Germany (* author for correspondence)
Received 10 July 1995; accepted in revised form 25 September 1995
Key words: Archaebacteria, endosymbiosis, molecular evolution, isoenzymes, carbohydrate
metabolism
Abstract
Previous studies indicated that plant nuclear genes for chloroplast and cytosolic isoenzymes of
3-phosphoglycerate kinase (PGK) arose through recombination between a preexisting gene of the eukaryotic host nucleus for the cytosolic enzyme and an endosymbiont-derived gene for the chloroplast
enzyme. We readdressed the evolution of eukaryotic pgk genes through isolation and characterisation
of a pgk gene from the extreme halophilic, photosynthetic archaebacterium Haloarcula vallismortis and
analysis of PGK sequences from the three urkingdoms. A very high calculated net negative charge of
63 for PGK from H. vallismortis was found which is suggested to result from selection for enzyme
solubility in this extremely halophilic cytosol. We refute the recombination hypothesis proposed for the
origin of plant PGK isoenzymes. The data indicate that the ancestral gene from which contemporary
homologues for the Calvin cycle/glycolyticisoenzymes in higher plants derive was acquired by the nucleus
from (endosymbiotic) eubacteria. Gene duplication subsequent to separation of Chlamydomonas and land
plant lineages gave rise to the contemporary genes for chloroplast and cytosolic PGK isoenzymes in
higher plants, and resulted in replacement of the preexisting gene for PGK of the eukaryotic cytosol.
Evidence suggesting a eubacterial origin of plant genes for PGK via endosymbiotic gene replacement
indicates that plant nuclear genomes are more highly chimaeric, i.e. contain more genes of eubacterial
origin, than is generally assumed.
Abbreviations: PGK, 3-phosphoglycerate kinase; FBA, fructose-l,6-bisphosphate aldolase; GAPDH,
glyceraldehyde-3-phosphate dehydrogenase; TPI, triosephosphate isomerase.
Introduction
The evolutionary origin of chloroplast/cytosol enzymes in plants traditionally has been viewed in
light of a working hypothesis known as the gene
transfer corollary to endosymbiotic theory [49],
which predicts that the genes for cytosolic isoenzymes were endogeneous to the nuclear genome
The nucleotide sequence data reported will appear in the EMBL, GenBank and DDBJ Nucleotide Sequence Databases under
the accession number L47295.
66
at the time of chloroplast acquisition and that, in
the process of chloroplast genome reduction,
genes for chloroplast isoenzymes were transferred
to the nucleus where they acquired a transit peptide for reimport into the organelle of their genetic
origin. Calvin cycle and glycolysis possess a number of parallel enzymatic reactions in cytosol and
stroma which are catalysed by distinct isoenzymes specific to each compartment, whereas homologues of the Calvin cycle and the oxidative
pentose phosphate pathway appear only to occur
in chloroplasts [42]. Molecular sequences for
these genes can help to shed light on the evolutionary mechanisms which led to the biochemical
partitioning now observed in contemporary photosynthetic eukaryotes. Recent molecular studies
of chloroplast/cytosol isoenzymes of Calvin cycle
and glycolytic pathways revealed, perhaps surprisingly, that the gene transfer corollary has not
been substantiated for any isoenzyme pair studied in plants to date. It was gene duplication, not
endosymbiotic gene transfer, which provided the
origin of the chloroplast/cytosol isoenzymes for
fructose-l,6-bisphosphate aldolase (FBA, duplication in early eukaryotic evolution) [36], triosephosphate isomerase (TPI, duplication in chlorophyte evolution) [21, 41 ], and glyceraldehyde-3phosphate dehydrogenase (GAPDH, duplication
in eubacterial evolution) [32, 20].
In the initial report of molecular sequences for
chloroplast and cytosolic 3-phosphoglycerate kinase from plants (PGK) [30] it was noted that
the sequences are more similar to one another
than one would expect under the gene transfer
corollary. It was suggested that recombination
had occurred between the respective isoenzyme
genes in an attempt to explain the lack of expected similarity between plant and other eukaryotic cytosolic PGK. But, due to the lack at that
time of sufficient reference sequences from
prokaryotic and eukaryotic sources, an explicit
hypothesis was not put forth for the origin of
either gene in the context of either eukaryotic or
eubacterial evolution. In a later study of PGK
gene evolution, Vohra et al. [48] interpreted the
phylogenetic results obtained as statistical support for the recombination model.
Sequences from archaebacteria are neccessary
for comparison to eubacterial and eukaryotic homologues in order to distinguish between different possibilities of PGK gene origin. Two PGK
sequences from methanogenic archaebacteria
exist in the database [ 13 ], but methanogens might
not be representative of all archaebacteria concerning PGK evolution. Methanogens use PGK
in hexose biosynthesis rather than energy metabolism [8, 9], whereas halophilic archaebacteria
possess rhodopsin-based photosynthetic membranes and have been reported to possess
ribulose-1,5-bisphosphate carboxylase activity [2,
38], a key Calvin cycle enzyme which methanogens lack altogether. It is thus conceivable that
aspects of sugar phosphate metabolism in halophiles are more similar to that in photosynthetic
eubacteria and eukaryotes, making them important links for understanding the evolution of genes
for enzymes of sugar phosphate metabolism in
general. Here we describe the cloning and structure of the PGK gene from the photosynthetic
extreme halophile Haloarcula vallismortis and reevaluate PGK gene evolution as it relates to the
origin of genes for plant chloroplast and cytosol
PGK.
Material and methods
Culture conditions and nucleic acid isolation
Haloarcula vallismortis type strain 3756 (ATCC
29715) was obtained from the Deutsche Sammlung ftlr Mikroorganismen (Braunschweig) and
grown under shaking at 43 °C in the medium
described [37]. Cells were grown to a n OD6o 0 of
3.0, harvested by centrifugation and lysed by suspension in 50 mM Tris-HC1, 50 mM EDTA
pH 8.0. The lysate was purified by phenol/
chloroform extraction, nucleic acids were recovered by ethanol precipitation, DNA was purified
by CsC1 centrifugation in a Ti70 rotor.
67
Isolation
of
a
pgk gene from
Haloarcula
vallismortis
A library was constructed in 2EMBL4 via
Sau 3AI partial digestion and purification of
18-22 kb fragments in saccharose gradients, ligation and packaging as described [40]. Recombinants were plated on Escherichia coli K803 and
screened by plaque hybridisation at 32 °C in hybridization buffer (6 × SSPE, 0.1 ~o SDS, 0.02~o
PVP, 0.02~o Ficoll-400) with 10 ng/ml (5 × 10 6
cpm/ml) of an end-labelled 16-fold degenerate 16mer constructed against the conserved amino acid
motif WYDNEY found in peptides sequenced
from the H. vallismortis GAPDH enzyme [37].
Positives were purified from which the hybridizing and overlapping 3.0 kb Bss HII, 3.3 kb Pst I
and 2.4 kb Sac II fragments were subcloned into
pBluescript vectors (Stratagene). The region encompassing the gap and pgk genes was sequenced
on both strands via Exo III deletion series and
synthetic oligonucleotide primers using the
dideoxy method. Other molecular methods were
performed as described [40].
Sequence analysis
Standard sequence analysis was performed with
the GCG Package [10]. Sources of sequences
used for analysis are given in the legend to Fig. 3.
Sequences were aligned with clustal V [22], the
alignment was refined by eye with lineup. After
exclusion of positions not occupied by an amino
acid in all sequences, 308 positions remained in
the final alignment which was used for phylogenetic inference. Distance between sequences was
measured as numbers of amino acid substitutions
per site corrected for multiple substitutions by
assuming a gamma distribution for the variability
of sub stitution rate acro s s positions [24 ]. For this,
a neighbour-joining tree [39] was constructed
using the proportion of amino acid differences
between PGK sequences, from which the gamma
parameter (a) was estimated. The value of a thus
determined (2.28) was used to estimate numbers
of substitutions per site between sequences using
the gamma correction. The resulting distance matrix yielded the final neighbour joining tree (calculated with a program kindly provided by M.
Nei). The reliability of branches was estimated by
bootstrapping (100 replicates) using the same
gamma parameter and by bootstrap parsimony
analysis using protpars of phylip [15].
Results and discussion
H. Vallismortis possesses a gap-pgk gene cluster
Ten phages which positively hydridized to the
synthetic oligonucleotide probe were purified and
shown by restriction mapping to overlap in the
region encompassing the gap gene. Several overlapping fragments spanning the region were subcloned and sequenced. Due to the very high GC
content ofH. vallismortis DNA, a large number of
sequencing runs on both strands in addition to
the use of deaza analogues were necessary before
all bases had been determined unambiguously.
The gap gene sequence (GenBank accession
number L47295) was found to contain both of the
oligopeptide sequences reported by Pr(iss et al.
[37] with no discrepancies. The deduced amino
acid sequence of the Haloarcula gap gene product
shares roughly 50~o identical residues with
GAPDH proteins from eubacteria and eukaryotes, but like eubacterial GAPDH, it shares only
ca. 15~o identical residues with sequences from
methanogenic archaebacteria [19] suggesting
that, in analogy to class I and class II aldolases
[36], type I and type II glutamine syntheases [27],
or form I and form II Rubisco [7, 33, 47], archaebacteria may possess 'class I' and 'class II'
GAPDH (manuscript in preparation).
Downstream of the gap gene we found a 1203
bp open reading frame, the predicted product of
which (Fig. 1) shows ~ 3 7 ~ identity to PGK
from methanogenic archaebacteria and < 30~o
identity to PGK from eukaryotes and eubacteria.
Between the stop codon of the gap gene and the
start codon of the pgk gene there is a 296 bp
stretch which contains several terminator-like
hairpin structures and palindromic sequences, the
68
.
.
.
.
.
T~GCGC~AAACGGTTTGGGACGCCAGTAAAGTTTAT~TTATTATTCGCT~G~TAG
60
.
.
.
.
.
120
G~ACACATAGAGTCCG~CCAGCACCTGACCCCTGTTCGGCTCTGTGTG~GACACAGG
.
.
.
.
.
180
CCCGTTTTCCCAGCCGGCGTGCTGTGTCCCGCGCCGGCTGTGTACGGCGCAGGTGTCCCC
>>>>>>>>>>>>>
<<<<<<<<<<<<<
240
ACCACTCTCCGACCCCATT~TTACGTTGGCAGC~GGCAGACCGCAGTGTTACCCCCAG
300
CAGGTCGCTCGCTGTTCGGACTC~AAACCGTT~GCGGGCGGCAGGTTTCCTCTCG~T
M
360
GATGACTTTCCAGACGCTCGATGACCTCGACGATGGACAGCGCGTCCTCGTCCGACTCGA
M T F Q T L D D L D D G Q R V L V R L D
420
CCTC~CTCACCGGTCGAGGACGGCACAGTACAGGAC~TCGACGGTTCGACCGCCACGC
L N S P V E D G T V Q D N R R F D R H A
480
GGAGACTGTC~GG~CTCGCTGACC~GGGTTCGAGGTC~AGTGCTGGCCCATCAGGG
E T V K E L A D R G F E V A V L A H Q G
540
CCGCCCGGGCCGCGACGACTTCGTCTCGCTCGACCAGCACGCCGACATCCTCGCCGACCA
R P G R D D F V S L D Q H A D I L A D H
600
CATCGACCGCGACGTGGATTTCGTCGACGAGACCTACGGGCCACAG~GATTCACGATAT
I D R D V D F V D E T Y G P Q A I H D I
660
C~CGACCTCGATAGTGGCGACGTGCTCGTTCTGGAG~CACGCGGATGTGTGACGACGA
A D L D S G D V L V L E N T R M C D D E
720
ACTGCCCGAGG~GACCCCG~GTG~CCCAGACGGAGTTCGTC~CGCTGGCCGG
L P E E D P E V K A Q T E F V K T L A G
780
GGAGTTCGACGCCTACATC~CGACGCCTACTCCGCG~CCACC~TCGCACGCCTCGCT
E F D A Y I N D A Y S A A H R S H A S L
840
CGTGGGCTTCCCGCTCGTGAT~ATGCCTACGCC~TCGCGTGAT~AGACC~GTACGA
V G F P L V M D A Y A G R V M E T E Y E
900
GGCC~CACCGCCATCGCCGAG~GGAGTTCGACGGGCAGGTGACGATGGTCGTCGGCGG
A N T A I A E K E F D G Q V T M V V G G
960
GACG~GGCCACCGACGTCATCGACGTGATGACCCACTTAGAC~G~GGTCGACGACTT
T K A T D V I D V M T H L D E K V D D F
1020
CTTGCTTGGCGGTATCGCG~CTGTTCCTGCGGCAGCC~CCACCCAGTCGGCTACGA
L L G G I A G T V P A A A G H P V G Y D
1080
CATCGACGACGCGAACCTCTACGACGAGCAGTGGGAGGCAAACAGTGAG~GATC~GTC
I D D A N L Y D E Q W E A N S E K I E S
1140
CAT~TCG~GACCACCGG~CCA~TTACCCTCGCGGTC~CCTG~CTAC~ACGA
M L E D H R D Q I T L A V D L A Y E D E
1200
G~CGACGACCGCGCCGAGCAGGCCGTCGACGACATCGACGAGAAAAGGCTCTCGTACCT
N D D R A E Q A V D D I D E K R L S Y L
most prominent of which is indicated in the figure. Upstream of the palindrome is an AT-rich
region which stands out due to the high GC content of the region sequenced (60 ~o overall, 85 ~o
at third codon positions). A TATA-like motif is
present 25 bp upstream of the pgk start codon.
Some gene clusters, such as the tryptophan
operon, are similarly organized in archaebacteria
and eubacteria [28, 29]. A gap-pgk gene cluster is
found in both archaebacterial and eubacterial genomes (Fig. 3), which suggests that this could represent a gene organisation which might have
been present in the progenote. The gene cluster
gap-pgk-tpi has been found in gram positive
eubacteria and Thermotoga maritima, although in
Thermotoga PGK and TPI are fused to a single
bifunctiona170 kDa protein [44]. In two 7-purple
bacteria, the fba gene for fructose-l,6-bisphosphate aldolase (Class II) [ 1] instead oftpi is found
downstream ofpgk (Fig. 3). Although in E. coli
the gap-pgk association is preserved, in the Haemophilus genome the pgk-fba cluster (accession
number U32734) is separated from gap and tpi by
over 500 and 100 kb, respectively, indicating that
'reshuffling' of these genes occurs at least occasionally in bacterial evolution. In accordance with
that view is the finding that pgk in the thermoacidophilic archaebacterium Sulfolobus (not mentioned in [3]) is located next to a gap gene, but in
pgk-gap order. Furthermore, the Sulfolobus gap
gene encodes a Class II G A P D H enzyme
(methanogen-like, see above), suggesting convergent pgk gene cluster organization in this thermoacidophile.
1260
CGACGT~GCAGCGAGACCCTCATGGAGTACTCGCCGATCATCCGCGAGTCCGAG~CGT
D V G S E T L M E Y S P I I R E S E A V
1320
CTTCGGTG~G~CGCGCTGGGATGTTCGAAGACG~CGGTTCTC~TCGGGACTGCCGG
F G E G R A G M F E D E R F S V G T A G
1380
TGTGCTCG~GCCATCGCCGACACCGACTGTTTCTCCGTCGTCGGCGGCGGCGACACCTC
V L E A I A D T D C F S V V G G G D T S
High net charge of PGK from an extreme halophile
H. vallismortis grows well at 43 °C in 4-5 M
NaCI, in our hands a doubling time of about 4 h
1440
AC~GCTATCGAGATGTAC~GATGG~G~GAC~GTTCGGCCACGTCTCCATCGCTGG
R A I E M Y G M E E D E F G H V S I A G
1500
CGGGGCCTACATCCG~CGTTGACGCGCGCAC~CTGGTCGGCGTCG~GTCCTCCAGCG
G A Y I R A L T R A Q L V G V E V L Q R
Fig. 1. Sequence of the Haloarcula vallismortispgk gene in the
gap-pgk cluster. An AT-rich region (underlined) and a con-
1560
CT~CTGCGGCCTTTCGTCTCGGCGTGTCGCAGGTCCCAGTCGTTGCGTTCGTCCGGG~
spicuous palindrome (>>> <<< ) are indicated. The putative
TATA box is double underlined, the stop codon of the gap
gene is indicated with three asterisks, that of the pgk gene with
one asterisk.
1620
CGTCGACG~CCGCCGCGTCTCGGTCGACTCCGTCCGTGCGTGGCGGGCAGTCCAGTCGT
GGTCAGCGTCGGGCGTGGCTGCCATACACTGCGTTACA
69
was observed. Enzymes of sugar phosphate
metabolism from extreme halophilic archaebacteria require salt concentrations of 2 to 3 M in
order to achieve full enzyme activity [2, 6, 11, 37],
a concentration which would be sufficient to precipitate many proteins from non-halophilic organisms. Furthermore, intracellular salt concentrations in extreme halophiles are typically at least
as high as those of the environment (ca. 3.5 M)
[9], it is therefore perhaps not surprising that the
deduced amino acid sequence of H. vallismortis
P G K has 93 aspartate plus glutamate residues
(D + E) in addition to 41 histidine, lysine plus
arginine (H + K + R) residues for total of 134 out
of 401 (33~o) highly charged residues and an
overall predicted charge of -63 from the simple
primary sequence. By comparison, E. coli P G K
has an overall charge of -11 (55 D + E and
48 H + K + R), human P G K a charge of + 3
(50 D + E and 58 H + K + R), and P G K from the
non-halophilic archaebacterium Methanothermus
fervidus a charge of +5 (55 D + E and 68
H + K + R). The highly negative charge ofH. vallismortis P G K is, however, comparable to that
previously reported for various cytoplasmic enzymes from extreme halophiles: superoxide dismutase, -32 [25]; NADP-specific glutamate dehydrogenase, -40 [4]; dnaK, - 102 [ 17 ]; enolase,
-53 [26]. This charge is thus not the product of
chance, but rather a common characteristic of
cytoplasmic proteins from halophiles which likely
serves to keep them in solution at high to saturating cellular salt concentrations. An alignment
of phosphoglycerate kinase sequences from eukaryotic, eubacterial and archaebacterial genomes (Fig. 2) indicates that the distribution of
charge in the H. vallismortis sequence is rather
uniform, revealing no obvious concentration of
either positively or negatively charged residues in
any portion of the protein primary structure.
Plant chloroplast/cytosol PGK isoenzymes: a gene
duplication
Based upon an expanded version of the alignment
in Fig. 2, phylogenetic analysis of P G K sequences
including chloroplast and cytosolic isoenzymes
from plants provide an intruiging picture of gene
evolution (Fig. 3). Both chloroplast and cytosolic
P G K sequences from plants assume a very distinct position in the gene tree relative to pgk genes
of non-photosynthetic eukaryotes. The cytosolic
enzyme does not branch with its homologues from
higher eukaryotes and very importantly, genes for
both plant isoenzymes share a common branch
which is located well below pgk genes from the
kinetoplastid protists Trypanosoma and Crithidia
with a quite high bootstrap percentage (94)
(Fig. 3). This finding quite clearly indicates that
neither genes for the cytosolic nor the chloroplast
enzyme in plants surveyed (only chlorophytes)
are orthologues of the cytosolic enzyme in nonphotosynthetic eukaryotes, since orthologues
would be expected to branch well above kinetoplastid genes, close to animals and fungi. This
unexpected pattern of similarity was perceived by
Longstaff et al. [30], who put forth a 'recombination' hypothesis (as one of several alternatives)
to explain the lack of expected similarity between
wheat cytosolic P G K and its eukaryotic homologues. Under that model, the gene for the plant
cytosolic enzyme is orthologous to its counterpert
in animals and fungi whereas the gene for the
chloroplast enzyme derives from the cyanobacteria-like antecedents of plastids, and subsequent
recombination was suggested to have occurred
between them resulting in genes with intermediate
similarity to eubacterial and eukaryotic homologues.
The recombination hypothesis was thus envoked under Weeden's [49] gene transfer corollary as a subordinate scenario to explain aspects
of the data and was accepted in later studies of
P G K evolution [14, 48]. However, recombination does not influence divergence between recornbining genes and therefore cannot account for the
finding that the plant isoenzymes are more similar (82~o identity) to each other than to any
eukaryotic or eubacterial homologue. The by far
most straightforward interpretation of the data is
that genes for the chloroplast and cytosolic isoenzymes ofphosphoglycerate kinase in higher plants
arose through gene duplication in eukaryotic ge-
70
1
90
....
+
+
.
.
.
.
++
-++
-
+-
-+
-
+
+
+--
Haloarcula
Methanobacterium
Methanothermus
MMTFQTLDD.LDDGQRVLVRLDLNSPVEDGT...VQDNRRFDRHAETVKELADRGFEVAVL.AHQGRPGRDDFV
................
MSLPFYTIDDF'NLEDKTVLVRVDINSPVDPST.GSILDDTKIKLHAETIDEISKKGAKTVVL.AHQSRPGKKDFT
................
MFKFYTMDDF.DYSGSRVLVRVDINSPVDPHT.GRILDDTRMRLHSKTLKELVDENAKVAIL.AHQSRPGKRDFT
................
Human
SLSNKLTLDKLDVXGKRVVMRVDFNVPMKNNQ...ITNNQRIKAAVPSIKFCLDNGAKSVVLMSHLGRPDGVPMPD
...............
Aspergl I Ius
MSLTSKLSITDVDLKDKRVLIRVDFNVPLDKNDNTTITNPQRIVGALpTIKYAIDNGAKAVILMSHLGRpDGKKNp
................
Saccharomyces
SLSSKLSVQDLDLKDKRVFIRVDFNVPLDGKX.--ITSNQRIVAALPTIKIrVLEHHpRYVVLASHLGRPNG.ERNE
...............
T. brucei c y t
MSLKERKSINECDLKGKKVLIRVDFNVPLDDGN...ITNDYRIRSALPAVQKVLTEGGS.CVI~MSHLGRPKGVSMAEGKELRSTGGIPGFEQ
MTLNEKKSINECDLKGKKvLIRVDFNVP9q4~GK...ITNDYRIRSALPTLKKVLTEGGS.CVLMSHLGRPKGIPMAQAGKIRSTGGVPGFQQ
T. brucei g l y c
MNKKTLKDIDVKGKRVFCRVDFNVPMKDGK.--VTDETRIRAAIPTIQYLVEQGAK.VILASHLGRPK-GEVVE
...............
Bacillus meg.
...............
Corynebacterium M A V K T L K D L L D E G V D G R H V I V R S D F N V P L N D D R . . E I T D K G R I I A S L P T L K A L S E G G A K . V I V M A H L G R . . Q G E V N E
MSVIKMTDLDLAGKRVFIRADLNVPVKDGK.--VTSDARIRASLPTIELALKQGAK-VMVTSHLGRPTEGEYNE
...............
Escherichia
MRTLLDLDPKGKRVLVRVDYNVPVQDGK--.VQDETRILESLPTLRHLLAGGAS.LVLLSHLGRPKGPDP
.................
Thermus
Zymomonas
MAFRTLDDIGDVKGKRVLVREDLNVPMDGDR..-VTDDTRLRAAIPTVNELAEKGAK.VLILAHFGRPKGQPNP
................
Triticum
MATKRSVGTLGEADLKGKKVFVRADLNVPLDDAQ-.KITDDTRIRASIPTIKYLLEKGAK.VILASHLGRPKGVTP
.................
cyt
chl <<MAKKSVGDLTSADLKGKKVFVRADLNVPLDDSQ..NITDDTRIRAAIPTIKHLINNGAK.VILSSHLGRPKGVTP
.................
Spinacia
chl <<MAKKSVGDLTAADLEGKRVLVRADLNVPLDDNQ..NITDDTRIRAAIPTIKYLLSNGAK-VILTSHLGRPKGVTP
.................
Trl ti cum
* * * *
*
,
*
180
-
Haloarcula
Me than oba c teri um
Methanothermus
Human
Aspergillus
Saccharomyces
T. brucei c y t
T. brucei g l y c
Bacillus meg.
Coryn ebac teri um
Escherichia
Thermus
Zymomonas
Triticum
cyt
Spinacia
chl
Triticum
chl
+
-+
-
-+-
+.
.
.
.
++
Haloarcula
Methanobacterium
Methanothermus
Human
Aspergillus
Saccharomyces
T. brucei c y t
T. brucei g l y c
Bacillus meg.
Cory~ebacterium
Escherichia
Thermus
Zymomonas
Triticum
cyt
Spinacia
chl
Triticum
chl
.
.
+
.
.
.
.
.
.
.
+
-
+
-
_
_
•S L D Q H A D I L A D H I D R D V D F V D E T Y G P Q A I H D I A D L D S G D V L V L E N T R M C D D E L
........ PEEDPEVKAQTEFVKTLAGEFDAYIND
• •T L Q Q H A K A L S N I L N R P V D Y I D D I F G T A A R E E I K R L K K G D I L L L E N V R F Y P E E I
........ LKRDPHQQAETHMVRKLYPI
IDIFIND
• -T M E E H S K I r L S N I L D M P V T Y V E D I F G C A A R E S I R N M E N G D I
ILLENVRFYSEEV
........ LKRDPKVQAETHLVRKLSSVVDYYIND
KYS LE PVAVELKSLLGKDVLFLKDCVGPEVEKACANpAAG
SV ILLENLRFHVEEEGKGKDASGNKVKAE
PAKI EAFRAS L S KLGDVYVND
KYS LK PVVPKLKELLGRDV
IFTEDCVGPEVEETVNKASGGQVI
LLI~LRFHAEEEG
S SKDADGNKVKADKDAVAQFRKGLTALGDI
T IND
KYS LAPVAKELQ SLLGKDVTFLNDCVGPEg-EAAVKASAPGSVI
L L E N L R Y H I E E E G S R K V -D G Q ~ S K E D V Q K F R H E L S
S LADVY IND
K A T L K P V A K A L S E L L S R P V T F A P D C L - •N A A D W S
KM S PGDVVLLENVRFYKEE
........ GSKSTEEREAMAKIAKAFAELADVYVND
K A T L K P V A K A L S E L L L R P V T F A P D C L . •N A A D W S
KMS PGDVVLLENVRFYKEE
........ GSKKAKDREAMAKI
• • -L A S Y G D ~ F / I S D
ELRLNAVAERLQALLGKDVAKADEAFGEEVKKT
I DGMS EGDVLVLENVRFYPGEE
........... KNDPEL ....... LASYGDVYISD
KYS LAPVAEAL S DELGQYVALAADVVGEDAHERANGL
T EGD I LLLENVRFDpRET
S ........ KDEAERNRFAQELAALAADNGAFVS
D
EF S LL PVVNYLKDKL SNPVRLVKDYL
...... DGVDVAEGELVVLENVRFN
...............
KGEKKDDETL S KKYAALC DVFVMD
KYS LAPVGEALRAHL
PEARFAPFP PGS EEARREAEALRPGEVLLLENVRFEPGEE
........... KNDPE .... LSARYARLGEAFVLD
EMS LAR IKDALAGVLGRPVHF
IND IKGEAAAKAVDALNPGAVALLENTRFYAGEE
........... KNDPA .... LAAEVAKLGDFYVND
KF S LKPLVARL SELLGLEVVMAPDC
IGEEVEKLAAAL PDGGVLLLENVRFYKEEE
........... ~NDPE .... FAKKLASVADLYVND
KFSLAPLVPRLSELLGLQVVKADDC
IG P D V E K L V A E L P E G G V L L L E N V R F Y K E E E
........... KNDpE .... FAKKLASLADLYVND
KF S LAPLVPRL S ELLG IEVKKAEDVI G PEVEKLVADLANGAVLL
LENVRFYKEEE
........... KNDPE .... FAKKLASLADLFVND
•
•
+
-
+
.
.
.
.
***
+-
•
•
-
+
-
-
+ --+
270
+
__
AYSAAHRSHASLVGFPLvM.DAYAGRVMETEYEANTAIAEKEFDGQVTMVvGGTKATDVIDvMTHLDEKvDDFLLGGIAGTvPAAAGHPV
AFAAAHRSQPSLVGFAVKL.PSGAGRIMEKELKSLYGAVDNA.EKP~DSIMVLENVLRN.GSADYVLTTGL~ANIFLWAS
AFAAAHRSQPSLVGFPLKL.PSAAGRLMEREVKTLYKIIKNV.EKPCVYILGGVKIDDSIMIMKNILKN.GSADYILTSGLVANVFLEAS
AFGTAHRAHSsMVGVNLPQ..KAGGFLMKKELNYFAKALESP.ERPFLAILGGAKVADKIQLINNMLDK...VNEMIIGGGMAFTFLKVL
AFGTAHRAHSSMVGVDLPQ..KASGFLVKKELEYFAKALEEP.QRPFLAILGGSKVSDKIQLIDNLLPK...VNSLIITGGMAFTFKKTL
AFGTAHRAHsSMVGFDLPQ..RAAGFLLEKELKYFGKA~ENP.TRPFLAILGGAKVADKIQLIDNLLDK...VDSIIIGGGMAFTFKKVL
AFGTAHRDSATMTGIPKILGHGAAGYLMEKEISYFAKVLGNP.PRPLVAIVGGAKVSDKIQLLDNMLQR..•IDYLLIGGAMAYTFLKAQ
AFGTAHRDSATMTGIPKILGNGAAGYLMEKEISYFAKVLGNP.PRPFTAIVGGAKVF~DKIGVIDHLLDK...VDNLIIGGGLSYTFIKAL
AFGAAHRAHASTEGIAQHI.PAVAGFLMEKELDVLSKALSNP.ERPLVAIVGGAKVSDKIQLLI~WMLQR...IDYLLIGGAMAYTFLKAQ
GFGVVHRAQTSVYDIAKLL.PHYAGGLVETEISVLEKIAESP.EAPYVVVLGGSKVSDKIGVIEALAAK...ADKIIVGGG~CYTFLAAQ
AFGTAHRAQASTHGIGKFADVACAGPLLAAELDALGKALKEP.ARPMVAIVGGSKVSTKLTVLDSLSKI...ADQLIVGGGIANTFIAAQ
AFGSAHRAHASVVGVARLL.PAYAGFLMEKEVRALSRLL••DP.ERPYAVVLGGAKVSDKIGVIESLLPR...IDRLLIGGAMAFTFLKAL
AFSAAHRAHVSTEGLAHKL.PAFAGRAMQKELEALEAALGKP.THPVAAwGGAKVSTKLDVLTNLVSK...VDHLIIGGGMANTFLAAQ
AFGTAHRAHASTEGVTKFLRPSVAGFLMQKELDYLVGAVANP.KKPFAAIVGGSKVSSKIGVIESLLAK...VDILILGGGMIFTFYKAQ
AFGTAHRAHAST•GVTKFLKPSVAGFLLQKELDYLVGAVSNP.KRPFAAIVGGSKVSSKIGVIESLLEK...CDILLLGGGMIFTFYKAQ
AFGTAHRAHASTEGVTKFLKPSVAGFLLQKELDYLDGAVSNP.KRPFAAIVGGSKVSSKIGVIESLLEK...CDILLLGGGMIFTFYKAQ
**
*
*
**
*
*
360
.
Haloarcula
Me thanobac teri um
Methanothermus
Httn~n
Aspergi i I us
Saccharomyces
T. brucei c y t
T. brucei g l y c
Bacillus meg.
Corynebacterium
Escherichia
Thermus
Zymomonas
Triticum
cyt
Spinacia
chl
Triticum
chl
.
.
.
.
.
.
+
-
--++ .
.
.
.
.
.
.
+
.
.
.
.
.
++
-
_
_
+-
-
G •Y D I D D A N L Y D E Q W E A N S E K I
ESMLEDHRDQI TLAVDLAYEDENDDRAEQAV
. . . . D D I D E K R L S -Y L D V G S E T L M E Y S P I
IRESEAVF
G. IN ..... LGKYNEDFI INKGYIDFVEKGKQLLEEFDGQI
~PDDVAVCVDNARVEYCTKNI
PNKPIYDIGTNT
I -T E Y A K F I R D A K T I
G- ID . . . . . I K E K N R K I L Y R K N Y K K F
IKMAKKLKDKYGEKI
LTPVDVAINE~GKRIDVP
IDDI PNFPIYDIGMET
I •K I Y A E K I R E A K T
I
N N M E I G T S • LFDEEGAKIVKDLMSKAEKNGVK
I TL PVDFVTADKFDENAKTGQATVASG
I PAGWMG.
-LDCG PES SKKYAEAVTRAKQ
IV
E N V K I G S S •L F D E A G S K I V G N I
I EKAKKHNVKVVL
PVDYVTADKFAADAKTGYATDEQ
G I P D G Y M G - •L D V G E K S V E S Y K Q T I A E S K T I
L
E N T E I G D S -I F D K A G A E I V P K L M E K A K A K G V E V V L P V D F I I A D A F S A D A N T K T V T D K E G I
P A G W Q G • •L D N G P E S R K L F A A T V A K A K T
IV
G - Y S I G I S •M C E E S K L E F A R S L L K K A E D R K V Q V I L P I D H V C H T E F K A % r D S P L I T E D
•Q N I P E G H M A
- -LDIGPKTIEKYVQTIGKCKSAI
G - YSIGKS
- KCEESKLEFARSLLKKAEDRKVQVILPIDHVCHTEFKAVDS
PLITED
.QNI PEGHMA
. -LDIGPKT IEKYVQT IGKCKSAI
G • HEVGKS
- LLEEDK
I ELAKS
FMEKAKKNGVNFYMPVDVVVADDF
SNDANI
QVVS
I • EDIPSDWEG
• •LDAG PKTRE IYADVI KNSKLVI
G - H N V Q Q S •L L Q E E M K A T C T D L L
.... ASVDK IVL PVDLVAASEFNKDAEKQ
I V D L - D S I P E G W M S - -L D I G P E S V K N F G E V L S T A K T
IF
G •H D V G K S -L Y E A D L V D E A K R L L
..... T TCN I PVP SDVRVATEF SETAPATLKSV
•N D V K A D E Q I • - L D I G D A S A Q E L A E
I LKNAKT i L
G •G E V G R S - L V E E D R L D L A K D L L G R A E A L G V R V Y L
P E D V V A A E R I E A G V E T R V F P A •R A I P V P Y M G • • L D I G P K T R E A F A R A L E G A R T V F
G •V D V G K S • L C E H E L K D T V K G I F A A A E K T G C K I
HL PSDVVVAKEFKANPP
I R T I P V . S D V A A D E M I • •L D V G P K A V A A L T E V L K A S K T L V
G -L A V G K S •L V E E D K L E L A T S L I E T A K S K G V K L L L P T D V V V A D K F A A D A E
S K 7 V P A •T A I P D G W M G • • L D V G P D S I K T F A E A L D T T K T V I
G •M S V G S S • L V E E D K L D L A T S
LLAKAKEKGVS LLL PT DVVIADKFAADADSKI
V P A • S G I P D G W M G • •L D I G P D S I K T F S E A L D T T Q T V I
G -L S V G S S -L V E E D K L E L A T S
LLAKAKAKGVS LLL P SDVI IADKFAPDANSQTVPAS A I P D G W M G - oL D I G P D S V K T F N D A L D T T Q T I
I
450
-
Haloarcula
Methanobacterium
Methanothermus
Human
Aspergillus
Saccharomyces
T. brucei c y t
T. brucei g l y c
Bacillus meg.
Corynebacterium
Escherichia
Thermus
Zymomonas
Triticum
cyt
Spinacia
chl
Tritlcum
chl
+
---+
.
.
.
.
+
. . . . .
+
+
+
-
+
G •E G R A G M F E D E R F
SVGTAGVLEA TADTDC .... F SVVGGGDT SRAI EMYGMEEDEFGHVS
IAGGAY IRALTRAQ LVGVEVLQR
FANGPAGVFEQEGFS
IGTEDI LNT IASSNG .... YSI IGGGHLAA
QMGLSSG. ITH I SSGGGAS INLLAGEKLPVVE
I LTEVKMKGRK
FANGPAGVFEEQQFS
IGTEDLLNAIASSNA
.... FSV IAGGHLAAAAEKMG
I S N K •I N H I S S G G G A C 7 A F L S G E E L P A I K V L E E A R K R S D K Y I
W -N G P V G V F E W E A F A R G T K A L M D E V V K A T
S R •G C I T I I G G G D T A T C C A K W N T E D K
•V S H V S T G G G A S L E L L E G K V L P G V D A L S N I L
W -N G P P G V F E M E P F A K A T K A T L D A A V A A V Q N
•G A T V I I G G G D T A T V A A K Y G A E D K
-I S H V S T G G G A S L E L L E G K E L
PGVAAL SEKS K
W' NGP PGVF E F EKFAAGTKALLDEVVKS
S A A •G N T V T I G G G D T A T V A K K Y G V T D K
•I S H V S T G G G A S L E L L E G K E L
PGVAFL SEKK
W •N G P M G V F E M V P Y S K G T F A I A K A M G R G T H E H G L M S
I IGGGESAGAAELCGEAAR
•I S H V S T G G G A S L E L L E G K T L
PGVTIrLDEKE
W -N G P M G V F E M V P Y S K G T
FA IAKAMGRGTHEHGLMS
I IGGGDSASAAELSGFJkKR
-M S H V S T G G G A S L E L L E G K T L
PGVTVLDEKSAVVS
>>
W -N G P M G V F E L D A F A N G T K A V A E A L A E A T D
• • •T Y S V I G G G D S A A A V E K F N L A D K
•M S H I S T G G G A S L E F M E G K E L P G V V A L N D K
W -N G P M G V F E F A A F S E G T R A S P R P S S M Q H A G N D A F S V V G G G D S A A S V R V L G L N E D G F S H
I STGGGASLEYLEGKEL
PGVAILAQ
W •N G P V G V F E F P N F R K G T E I V A N A I A D S E
. . . . A F S I A C ~ _ ~ 3 D T L A A I D L F G I A D K •I S Y I S T G G G A F L E F V E G K V L
PAVAMLE ERAKK
W -N G P M G V F E V P P F D E G T L A V G Q A I A A L E
• • •G A F T V V G G G D S V A A V N R L G L K E R
•F G H V S T G G G A S L E F L E K G T L P G L E V L E G
W -N G P L G A F E I E P F D K A T V A L A K E A A A L
T KAGS L I SVAGGGDTVAALNHAGVAKD
•F S F V S T A G G A F L E W M E G K E L
PGVKALEA
W -N G P M G V F E F E K F A A G T D A
IAKQLAELTGK
•G V T T I I G G G D S V A A V E K A G L A D K
•M S H I S T G G G A S L E L L E G K P L P G V L A L D E A
W -N G P M G V F E F E K F A A G T E A
I A K K L E E I S K K •G A T T I I G G G D S V A A V E K V G V A E A
-M S H I S T G G G A S L E L L E G Q T A S W S T C S
W -N G P M G V F E F D K F A V G T E S
I A K K L A E L S K K -G V T T I I G G G D S V A A V E K V G V A D V
-M S H I S T G G G A S L E L L E G K E L P G V V A L D E G V M T R S V T V
.
**
**
•
***
71
nomes. That view is strongly supported by the
position of the Chlamydomonas sequence, which
branches robustly with other plant sequences but
below the node corresponding to the duplication
event which gave rise to the chloroplast and cytosolic isoenzymes of wheat (and presumably of
other higher plants). The position of the spinach
chloroplast P G K sequence suggests that the duplication occurred before the separation ofmonocots and dicots.
Gene duplications which resulted in enzymes
of different eukaryotic cell compartments is a recurring theme of P G K evolution, as suggested by
the portion of the tree beating the kinetoplastid
sequences (Fig. 3). It is evident that genes for
recompartmentalized P G K enzymes destined for
the glycosome, a specific glycolytic microbody
present in some but not all kinetoplastids [18],
have arisen several times independently in kinetoplastid evolution. The three P G K genes of T.
brucei and T. congolense are organized in tandem
[34, 35], the first in the array (pgkA) carries a
large insertion of roughly 80 amino acids not
found in other P G K enzymes. The resulting product is of higher molecular weight (ca. 56 kDa) and
is located in glycosomes [35], although it is not
the major glycosomal form in T. brucei [34]. The
topogenic signals involved in glycosomal precursor import are quite different from those for chloroplasts [34, 46], but for both plants and protists,
the genes for differentially compartmentalized
P G K isoenzymes arose through duplication
(Fig. 3).
ing that it represents an orthologue of other archaebacterial P G K genes (the partial Sulfolobus
P G K sequence also branches with these homologues, data not shown). The elevated rate of
amino acid substitution for the H. vallismortis sequences implied by the length of its terminal
branch very likely relates to the roughly 40 additional acidic amino acids that this sequence contains relative to its homologues from nonhalophilic organisms. An important aspect of
Fig. 3 is that eubacterial and eukaryotic P G K
genes are much more similar to each other (ca.
50~o identity) than either is to archaebacterial
sequences (only ca. 30~o amino acid identity).
This is in marked contrast to the majority of protein coding genes which have been studied in the
three urkingdoms; they depict a different urkingdom topology, showing greater similarity between
archaebacterial and eukaryotic genes relative to
eubacterial homologues [5, 12, 16, 23, 51, 52].
This discrepancy in P G K gene evolution was a
finding upon which, at least in part, Zillig's hypothesis for the origin of the eukaryotic genome via
the fusion of archaebacterial and eubacterial genomes was originally based [51, 52] (see [12] for
a review). Due to the high degree of divergence
between archaebacterial and remaining P G K sequences, the position of the archaebacterial
branch may be a long-branch artefact, as suggested by low bootstrap values in basal portion of
the tree.
Eubacterial origin of chlorophyte PGK genes
Comparison of PGK sequences from the three urkingdoms
H. vallismortis P G K is more similar to P G K from
two methanogenic archaebacteria (Fig. 3) than it
is to eubacterial or eukaryotic sequences, suggest-
Note that the portion of the P G K tree for nonphotosynthetic eukaryotes is consistent with other
molecular phylogenies regarding the branching
order of trypanosomes, Plasmodium, animals and
fungi. Note that if one disregards for a moment
all eukaryotic sequences in the figure (i.e. by sim-
Fig. 2. Alignmentof PGK sequences.Sourcesof sequencesare as givenin the legendto Fig. 3. Strictlyconservedresiduesin the
alignmentare indicatedwith an asterisk. Positivelyand negativelychargedamino acids in the Haloarculavallismortissequenceare
indicated above the alignmentwith '+' and ' - ' respectively.Gaps are indicated as dots. cyt, cytosolic;chl, chloroplast; glyc,
glycosome. ' >>' indicates that the C-terminusof the T. bruceiglycosomalsequence(YASAGTGTLSNRWSSL)is not shown.
' <<' indicates the presenceof a transit peptide. Position numberingis arbitrary.
72
3-Phosphoglycerate Kinase Gene Tree
s3/9
l°rJ~--J°I II
ss
~uu ]l
,
,
,
Species and Gene
Eukaryotea: Enzyme Compartment
Prokaryotes: Gene Cluster
Human
Mouse
Human (TS)
Chicken
Drosophila
Schistosoma
Tetrahymena
cytosol
cytosol
cytosol
cytosol
cytosol
cytosol
cytosol
Aspergillus
Penicillium
Trichoderma reesei
Trichoderma viridis
Neurospora
Yarrowia
Rhizopus pgk l
Rhizopus pgk2
Kluyveromyces
Saccharomyces
Candida
Plasmodium
cytosol
cytosol
cytosol
cytosol
cytosol
cytosol
cytosol
cytosol
cytosol
cytosol
cytosol
cytosol
E
U
100
100
92
68
r
Y
O
t
i
C
N
U
e
I
e
a
4t"Eukaryotic gene duplications which resulted in recompartmentallzedenzymes
991
6 7 ~
100
100
+
l
100
100
k
a
,1o
i
100
:_t1
loo I
100
I
loo
100
100
loo
loo
]
I
~
Crithidia fasciculata
Crithidia fasciculata
Trypanosoma brucei
Trypanosoma brucei
Trypanosoma brucei (56 kD)
Trypanosoma congolense
Trypanosoma congolense (56 kD)
-- Crithidia fasciculata (56 kD)
cytosol
glycosome
cytosol
glycosome
glycosome
cytosol
glycosome
glycosome
Spinach
Wheat
Wheat
Chlamydomonas reinhardii
(low GC g+) Bacillus megaterium
(low GC g+) Bacillus stearothermophilus
chloroplast
chloroplast
cytosol
chloroplast
gap-pgk-tpi
gap-pgk
(high GC g+) Corynebacterium glutamicum
(high GC g+) Mycobacterium leprae
Thermus aquaticus
Thermotoga maritima
gap-pgk-tpi
gap-pgk-orf-tpi
gap-pgk
gap-pgk-tpi
(o~-purple)Xanthobacter flavus
(~-purple) Zymomonas mobilis
(y-purple) Escherichia coil
(y-purple) Haemophilus influenzae
pgk
gap-pgk
gap-pgk-fba
pgk-fba
Methanobacterium bryanUi
Methanothermus fervidus
Haloarcula vallismortis
pgk
pgk
gap-pgk
r
G
e
n
e
s
1
.~
A
r
c
h
Fig. 3. Neighbour joining tree for PGK protein sequences constructed from pairwise estimates of numbers of amino acid substitutions obtained assuming a gamma distribution for the variabiliy of substitution rate across positions. Genes found in eukaryotic genomes are designated with open branches, those found in eubacterial and archaebacterial genomes are designated with solid
branches. Gene duplication events which resulted in subcellular recompartmentalization of enzymes are indicated with stars.
Abbreviations are: TS, testis-specific; g +, gram-positive bacteria; GC, guanosine plus cytosine content; purple, purple bacteria;
arch, archaebacteria. The scale bar at the upper left indicates 0.1 substitutions per site. PGK and TPI are fused in a single
cotranslated protein in Thermotoga [44], as indicated by underlining. Numbers above nodes indicate the bootstrap proportion for
100 replicates using the same tree construction method, numbers below nodes (or in parenthesis) indicate the bootstrap proportion using protein parsimony of phylip. Bootstrap values lower than 50/100 are indicated as ' - ' . Sources of sequences are: Aspergillus nidulans P 11977, Bacillus megaterium P24269, Bacillus stearothermophilus P 18912, Candida maltosa P41757, Chlamydomonas
73
ply covering them with a piece of paper), the eubacterial portion of the P G K tree is also consistent in many aspects with traditional molecular
systematic views of eubacterial evolution [ 50 ]: (1)
common branching of purple bacterial sequences
[45], (2) common branching of low-G + C grampositives [31], (3) common branching of high
G + C-gram-positives [31 ], (4) separate branches
for low and high G + C-gram-positives respectively [31], and (5) 'basal' positions within eubacteria for the thermophiles Thermus and Thermotoga. The 'problem' with the P G K tree is simply
that P G K genes located in eukaryotic genomes
are borne on eubacterial branches, and the plant
sequences are distinct from remaining eukaryotic
homologues. Two scenarios, each of which takes
into account current theories on the origin of the
eukaryotic nucleus can account for this finding.
As to the first of these, we assume that the
fusion hypothesis [12] is correct. It can account
for the P G K gene tree topology observed, if (and
only if) one makes the corollary assumptions that
(1) the position of the archaebacterial 'root' is a
treeing artefact, its true position lying on the 92/
100 bootstrap branch bearing non-photosynthetic
eukaryotic P G K sequences (which cannot be excluded), (2) eukaryotes lost the archaebacterial
copy and retained only the eubacterial pgk gene
subsequent to fusion, (3) the position of the plant
sequences is explained as the result of subsequent
acquisition via endosymbiotic gene transfer from
eubacteria of a pgk gene which then underwent
duplication to yield the plant isoenzymes, and (4)
the preexisting gene for the plant homologue of
cytosolic P G K from animals and fungi was simply replaced (or lost) subsequent to duplication.
But what if the fusion hypothesis is not correct?
As the second scenario, we assume that archaebacteria and eukaryotes are truly sister groups as
suggested by several papers [12]. In that case,
one can account for the P G K topology observed,
if(and only if) the corollary assumptions are made
that (1) all P G K sequences found in eukaryotic
genomes are of eubacterial (endosymbiotic) origin, (2) the ancestral plant gene (which duplicated
to yield the plant isoenzymes) is of eubacterial
origin via endosymbiotic gene transfer from chloroplasts, and (3) the preexisting gene for the plant
homologue of cytosolic P G K from animals and
fungi was simply replaced (or lost) subsequent to
duplication.
In either case, the ancestral gene which gave
rise via duplication to the plant isoenzymes are of
eubacterial origin, and any preexisting P G K enzyme of the primitive chlorophyte cytosol must
have been replaced (lost) subsequent to the duplication event. That would easily account for the
finding that Chlamydomonas reinhardtii does not
possess a cytosolic isoenzyme for P G K [43],
rather only a chloroplast enzyme. Non-green
photosynthetic protists (or Euglena) which
branched early in eukaryotic evolution might have
retained different type(s) of P G K genes. Plant
P G K sequences are slightly more similar (with
low bootstrap support) to P G K from low-G + Cgram-positive eubacteria (Bacillus) than to homologues from other eubacteria, the same is found
for another nuclear-encoded chloroplast enzyme
of eubacterial origin [20, 32]. No P G K sequence
data from cyanobacteria are currently available.
Under our working hypothesis of chloroplast origin for plant PGK, we would predict cyanobac-
reinhardtii U14912, Corynebacterium glutamicum Q01655, Crithidiafasciculata pgkA glycosome(56PGK) P25055, Crithidiafasciculata pgkB cytosolP08966, Crithidiafasciculata pgkC glycosomeP08967, Drosophila melanogaster Q01604, Escherichia coli P11665,
chicken L37101, Haemophilus influenzae U32734, human P00558, human testis-specific X05246, Kluveromyces lactis P14828,
Methanobacterium bryantii P20972, Methanothermus fervidus P20971, mouse P09411, Mycobacterium leprae U00013, Neurospora
crassa P38667, Penicilliurn citrinum P33161, Plasmodium falciparum P27362, Rhizopus niveus pgkl P29405, Rhizopus niveus pgk2
P29406, Saccharomyces cerevisiae P00560, Schistosoma mansoni L36833,Spinacia oleracea chloroplastP29409, Tetrahymena thermophila X63528, Thermotoga maritima X75437, Thermus aquaticus P09403, Trichoderma reesei P14228, Trichoderma viride P24590,
Triticum aestivum chloroplast P12782, Triticum aestivum cytosol P12783, Trypanosoma brucei pgkA glycosome(56PGK) P08891,
Trypanosoma brucei pgkB cytosolP07377, Trypanosoma brucei pgkC glycosomeP07378, Trypanosoma congolense pgkA glycosome
(p56) P41762, Trypanosoma congolense pgkC cytosolP41760,Xanthobacterflavus U08462, Yarrowia lipolytica P29407, Zymomonas
mobilis P09404.
74
terial pgk genes to be found which branch above
Bacillus PGK but below the nuclear gene for
Chlamydomonaschloroplast PGK in the pgk gene
phylogeny.
Acknowledgements
We thank C. Schnarrenberger for valuable discussions. This work was supported by grants from
the Deutsche Forschungsgemeinschaft.
References
1. Alefounder PR, Perham SA: Identification, molecular
cloning and sequence analysis of a gene cluster encoding
the class II fructose 1,6-bisphosphate aldolase, 3-phosphoglycerate kinase and a putative second glyceraldehyde
3-phosphate dehydrogenase of Escherichia coli. Mol Microbiol 3:723-732 (1989).
2. Altekar W, Rajagopalan R: Ribulose bisphosphate carboxylase activity in halophilic archaebacteria. Arch Microb~ol 153:169-174 (1990).
3. Arcari P, Dello-Russo A, Ianniciello G, Gallo M,
Bocchini V: Nucleotide sequence and molecular evolution
of the gene coding for glyceraldehyde-3-phosphate dehydrogenase in the thermoacidophilic archaebacterium Sulfolobus solfataricus. Biochem Genet 31:241-251 (1993).
4. Benachenhou N, Baldacci G: The gene for a halophilic
glutamate dehydrogenase: sequence, transcription analysis and phylogenetic implications. Mol Gen Genet 230:
345-352 (1991).
5. Brown JR, DoolittleWF: Root ofthe universal tree of life
based on ancient aminoacyl-tRNA synthase gene duplications. Proc Natl Acad Sci USA 92:2441-2445 (1995).
6. Cendrin F, Chroboczek J, Zaccia G, Eisenberg H, Mevarech M: Cloning, sequencing and expression in Escherichia coli of the gene coding for malate dehydrogenase
from the extremely halophilic archaebacterium Haloarcula
marismortui. Biochemistry 23:4308-4313 (1993).
7. Chen J-H, Gibson JL, McCue LA, Tabita FR: Identification, expression, and deduced primary structure of
transketolase and other enzymes encoded within the
form II carbon dioxide fixation operon of Rhodobacter
sphaeroides. J Biol Chem 266:20447-20452 (1991).
8. Danson MJ: Archaebacteria: the comparative enzymology of their central matabolic pathways. Adv Microbial
Physiol 29:164-231 (1988).
9. Danson MJ: Central metabolism of the archaebacteria:
an overview. Can J Microbiol 35:58-64 (1989).
10. Devereux J, Haeberli P, Smithies O: A comprehensive set
of sequence analysis programs for the vax. Nucl Acids
Res 12:387-395 (1984).
11. Dhar NM, Altekar W: Distribution of class I and class
II fructose bisphosphate aldolases in halophilic archaebacteria. FEMS Microbiol Lett 35:177-181 (1986).
12. Doolittle RF: Of Archae and Eo: What's in a name? Proc
Natl Acad Sci USA 92:2421-2423 (1995).
13. Fabry S, Heppner P, Dietmaier W, Hensel R: Cloning
and sequencing the gene encoding 3-phosphoglycerate kinase from mesophilic Methanobacterium bryantii and thermophilic Methanothermusfervidus. Gene 91: 19-25 (1990).
14. Fothergill-Gilmore LA, Michels PAM: Evolution of glyeolysis. Progr Biophys Mol Biol 59:105-238 (1993).
15. Felsenstein J: PHYLIP: Phylogeny Inference Package
(version 3.2). Cladistics 5:164-166 (1989).
16. Gogarten JP, Kibak H, Dittrich P, Taiz L, Bowman EJ,
Bowman BJ, Manolson MF, Poole RJ, Date T, Oshima
T, Konishi J, Denda K, Yoshida M: Evolution of the
vacuolar H +-ATPase: Implications for the origin of eukaryotes. Proc Natl Acad Sci USA 86:6661-6665 (1989).
17. Gupta RS, Singh B: Cloning of the HSP70 gene from
Halobacterium marismortui: relatedness of archaebacterial
HSP70 to is eubacterial homologs and a model for the
evolution of the HSP70 gene. J Bact 174:4594-4605
(1992).
18. Hannaert V, Michels P: Structure, function and biogenesis of glycosomes in Kinetoplastida. J Bioenerg
Biomembr 20:205-212 (1994).
19. Hensel R, Zwickl P, Fabry S, Lang J, Palm P: Sequence
comparison of glyceraldehyde-3-phosphate dehydrogenase from the three urkingdoms: evolutionary implication. Can J Microbiol 35:81-85 (1989).
20. Henze K, Badr A, Wettern M, Cerff R, Martin W: A
nuclear gene of eubacterial origin in Euglena gracilis reflects cryptic endosymbioses during protist evolution.
Proc Natl Acad Sci USA 92:9122-9126 (1995).
21. Henze K, Schnarrenberger C, Kellermann J, Martin W:
Chloroplast and cytosolic triosephosphate isomerase
from spinach: Purification, microsequencing and cDNA
sequence of the chloroplast enzyme. Plant Mol Biol 26:
1961-1973 (1994).
22. Higgins DG, Sharp PM: CLUSTAL: A package for performing multiple sequence alignment on a microcomputer.
Gene 73:237-244 (1988).
23. Iwabe N, Kuma K-I, Hasegawa M, Osawa S, Miyata T:
Evolutionary relationship of archaebacteria, eubacteria
and eukaryotes inferred from phylogenetic trees of duplicated genes. Proc Natl Acad Sci USA 86:9355-9359
(1989).
24. Jin L, Nei M: Limitations of the evolutionary parsimony
method of phylogenetic analysis. Mol Biol Evol 7: 82-102
(1990).
25. Joshi PB, Dennis PP: Characterization ofparalogous and
orthologous members of the superoxide dismutase gene
family from genera of the halophilic archaebacteria. J
Bact 175:1561-1571 (1993).
26. KrOmer WJ, Arndt E: Halobacterial $9 operon: three
ribosomal protein genes are cotranscribed with genes en-
75
27.
28.
29.
30.
31.
32.
33.
34.
35.
36.
37.
coding a tRNA-Leu, the enolase, and a putative membrane protein in the archaebacterium Haloarcula (Halobacterium) marismortui. J Biol Chem 266:24573-24579
(1991).
Kumada Y, Benson DR, Hillemann D, Hosted TJ, Rochefort DA, Thompson CJ, Wohlleben W, Tateno Y: Evolution of the glutamin synthase gene, one of the oldest
existing and functioning genes. Proc Natl Acad Sci USA
90:3009-3013 (1993).
Lam WL, Cohen A, Tsouluhas D, Doolittle WF: Genes
for tryptophan biosynthesis in the halophile archaebacterium Haloferax volcanff: the trpDFEG cluster. Proc Natl
Acad Sci USA 87:6614-6618 (1990).
Lam WL, Logan SM, Doolittle WF: Genes for tryptophan biosynthesis in the halophile archaebacterium Haloferax volcanii: the trpDFEG cluster. J Bact 174: 16941697 (1992).
Longstaff M, Raines CA, McMorrow EM, Bradbeer JW,
Dyer T: Wheat phosphoglycerate kinase: Evidence for
recombination between the genes for the chloroplastic
and cytosolic enzymes. Nucl Acids Res 17:6569-6580
(1989).
Ludwig W, Kirchhof G, Klugbauer N, Weizenegger N,
Betzl D, Ehrmann M, Hertel C, Jilg S, Tatzel R, Zitelsberger H, Liebl S, Hochberger M, Shah J, Lane D,
WallnOferP, Scheifer KH: Complete 23S ribosomal RNA
sequences of Gram positive bacteria with a tow G + C
content. System Appl Microbiol 15:487-501 (1992).
Martin W, Brinkmann H, Savona C, Cerff R: Evidence
for a chimaeric nature of nuclear genomes: eubacterial
origin of eukaryotic glyceraldehyde-3-phosphate dehydrogenase genes. Proc Natl Acad Sci USA 90:8692-8696
(1993).
Morse D, Salois P, Markovic P Hastings JW: A nuclearencoded form II RuBisCO in dinoflagellates. Science 268:
1622-1624 (1995).
Osinga KA, Swinkels BW, Gibson WC, Borst P, Veeneman GH, van Boom JH, Michels PAM, Opperdoes F:
Topogenesis of microbody enzymes: a sequence comparison of the genes for the glycosomal (microbody) and cytosolic phosphoglycerate kinases of Trypanosoma brucei.
EMBO J 4:3811-3817 (1985).
Parker HL, Hill T, Alexander K, Murphy NB, Fish WR,
Parsons M: Three genes and two isoenzymes: Gene conversion and the compartmentalization and expression of
the phosphoglycerate kinases of Trypanosoma (Nannomohas) congolense. Mol Biochem Parasitol 69:269-279
(1995).
Pelzer-Reith B, Penger A, Schnarrenberger C: Plant aldolase: cDNA and deduced amino-acid sequence of the
chloroplast and cytosol enzyme from spinach. Plant Mol
Biol 21:331-340 (1993).
Pr~lss B, Meyer HE, Holldorf AW: Characterization of
the glyceraldehyde-3-phosphate dehydrogenase from the
extremely halophilic archaebacterium Haloarcula vallismortis. Arch Microbiol 160:5-I1 (1993).
38. Rawal H, Kelkar SM, Altekar W: Ribulose-l,5-bisphosphate dependent CO2 fixation in the halophilic archaebacterium Halobacterium mediterranei. Biochem Biophys
Res Comm 156:451-456 (1988).
39. Saitou N, Nei M: The neighbor-joining method: a new
method for the reconstruction of phylogenetic trees. Mol
Biol Evol 4:406-425 (1987).
40. Sambrook J, Fritsch EF, Maniatis T: Molecular Cloning:
A Laboratory Manual. Cold Spring Harbor Laboratory
Press, Cold Spring Harbor, New York (1989).
41. Schmidt M, Dvendsen I, Feierabend J: Analysis of the
primary structure of the chloroplast isozyme of triosephosphate isomerase from rye leaves by protein and
cDNA sequencing indicates a eukaryotic origin of its gene.
Biochim Biophys Acta 1261:257-264 (1995).
42. Schnarrenberger C, Flechner A, Martin W: Enzymatic
evidence indicating a complete oxidative pentose phosphate in the chloroplasts and an incomplete pathway in
the cytosol of spinach leaves. Plant Physiol 108:609-614
(1995).
43. Schnarrenberger C, Jacobshagen S, MOiler B, Krtlger I:
Evolution of isozymes of sugar phosphate metabolism in
green algae. In: Ogita S, Scandalios JS (eds) Isozymes:
Structure, Function, and Use in Biology and Medicine,
pp. 743-7460. Wiley-Liss, New York (1990).
44. Schurig H, Beaucamp N, Ostendorp R, Jaenicke R, Adler
E, Knowles JR: Pbosphoglycerate kinase and triosephosphate isomerase from the hyperthermophilie bacterium
Thermotoga maritima form a covalent bifunctional enzyme
complex. EMBO J 14:442-451 (1995).
45. Stackebrandt E, Murray RGE, Trtlper HG: Proteobacten'a classis nov., a name for the phylogenetic taxon that
includes the 'purple bacteria and their relatives'. Int J Syst
Bact 38:321-325 (1988).
46. Swinkels BW, Evers R, Borst P: The topogenic signal of
the glycosomal (microbody) phosphoglycerate kinase of
Crithida fasiculata resides in a carboxy-terminal extension. EMBO J 7:1159-1165 (1988).
47. Tabita RF: Carbon dioxide fixation and its regulation in
cyanobacteria. In: Fay P, van Baalen C (eds) The Cyanobacteria, pp. 95-118 Elsevier, Amsterdam (1987).
48. Vohra GB, Golding GB, Tsao N, Pearlman RE: A phylogenetic analysis based on the gene encoding phosphoglycerate kinase. J Mol Evol 34 383-395 (1992).
49. Weeden NF: Genetic and biochemical implications of the
endosymbiotic origin of the chloroplast. J Mol Evol 17:
133-139 (1981).
50. Woese CR: Bacterial evolution. Microbiol Rev 51: 221271 (1987).
51. Zillig W, Klenk H-P, Palm P, Leffers H, Ptlhler G, Gropp
F, Garrett RA: Did eukaryotes originate by a fusion event?
Endocyt C Res 6:1-25 (1989).
52. Zillig W: Comparative biochemistry of Archaea and Bacteria. Curr Opin Genet Devel 1:544-550 (1991).