Glyceraldehyde-3-Phosphate Dehydrogenase

Glyceraldehyde-3-Phosphate Dehydrogenase Gene Diversity in Eubacteria
and Eukaryotes: Evidence for Intra- and Inter-Kingdom Gene Transfer
Rainer Martin Figge, Milada Schubert, Henner Brinkmann, and Rüdiger Cerff
Institut für Genetik, Technische Universität Braunschweig, Germany
Cyanobacteria contain up to three highly divergent glyceraldehyde-3-phosphate dehydrogenase (GAPDH) genes:
gap1, gap2, and gap3. Genes gap1 and gap2 are closely related at the sequence level to the nuclear genes encoding
cytosolic and chloroplast GAPDH of higher plants and have recently been shown to play distinct key roles in
catabolic and anabolic carbon flow, respectively, of the unicellular cyanobacterium Synechocystis sp. PCC6803. In
the present study, sequences of 10 GAPDH genes distributed across the cyanobacteria Prochloron didemni, Gloeobacter violaceus PCC7421, and Synechococcus PCC7942 and the a-proteobacterium Paracoccus denitrificans and
the b-proteobacterium Ralstonia solanacearum were determined. Prochloron didemni possesses homologs to the
gap2 and gap3 genes from Anabaena, Gloeobacter harbors gap1 and gap2 homologs, and Synechococcus possesses
gap1, gap2, and gap3. Paracoccus harbors two highly divergent gap genes that are related to gap3, and Ralstonia
possesses a homolog of the gap1 gene. Phylogenetic analyses of these sequences in the context of other eubacterial
and eukaryotic GAPDH genes reveal that divergence across eubacterial gap1, gap2, and gap3 genes is greater than
that between eubacterial gap1 and eukaroytic glycolytic GapC or between eubacterial gap2 and eukaryotic Calvin
cycle GapAB. These data strongly support previous analyses which suggested that eukaryotes acquired their nuclear
genes for GapC and GapAB via endosymbiotic gene transfer from the antecedents of mitochondria and chloroplasts,
and extend the known range of sequence diversity of the antecedent eubacterial genes. Analyses of available GAPDH
sequences from other eubacterial sources indicate that the glycosomal gap gene from trypanosomes (cytosolic in
Euglena) and the gap gene from the spirochete Treponema pallidum are each other’s closest relatives. This specific
relationship can therefore not reflect organismal evolution but must be the result of an interkingdom gene transfer,
the direction of which cannot be determined with certainty at present. Contrary to this, the origin of the cytosolic
Gap gene from trypanosomes can now be clearly defined as g-proteobacterial, since the newly established Ralstonia
sequence (b-proteobacteria) branches basally to the g-proteobacterial/trypanosomal assemblage.
Introduction
Glyceraldehyde-3-phosphate
dehydrogenase
(GAPDH, phosphorylating: EC 1.2.1.12 and 1.2.1.13) is
an ancient and ubiquitously distributed enzyme of sugar
phosphate metabolism that catalyzes the reversible interconversion of glyceraldehyde-3-phosphate and 3phosphoglycerate (Fothergill-Gilmore and Michels
1993). Two very different types of phosphorylating
GAPDH enzymes with only 10% to 15% sequence similarity are distributed across eubacteria and archaebacteria; these were designated as class I and class II
GAPDH (Cerff 1995). Whereas class II GAPDH has so
far only been found in archaebacteria (Hensel et al.
1989), class I GAPDH is the classic enzyme of glycolysis and the Calvin cycle in eubacteria and eukaryotes
and has been characterized in one halophilic archaebacterium (Prüß, Meyer, and Holldorf 1993; Brinkmann and
Martin 1996).
Across eubacterial genomes, class I GAPDH genes
possess an exceptional degree of sequence diversity in
the form of gene families. For example, three highly
divergent gap genes each are found in the genomes of
Anabaena variabilis (Martin et al. 1993), Escherichia
coli (Blattner et al. 1997), and P. aeruginosa (see Pseudomonas genome project: http://www.pseudomonas.com/),
whereas two highly divergent gap genes are found in
Key words: cyanobacteria, proteobacteria, spirochetes, euglenozoa, glyceraldehyde-3-phosphate dehydrogenase, organelle evolution,
gene transfer, long-branch attraction.
Address for correspondence and reprints: Rüdiger Cerff, Institut
für Genetik, Technische Universität Braunschweig, Spielmannstr. 7, D38106 Braunschweig, Germany. E-mail: [email protected].
Mol. Biol. Evol. 16(4):429–440. 1999
q 1999 by the Society for Molecular Biology and Evolution. ISSN: 0737-4038
the genomes of Synechocystis sp. PCC6803 (Kaneko et
al. 1996; Koksharova et al. 1998) and Bacillus subtilis
(Kunst et al. 1997). However, these genomes do not possess the same two or three gap genes. Rather they possess divergent samples of a larger GAPDH gene family,
members of which are distributed in a skewed manner
across contemporary eubacterial genomes. This view is
substantiated by the finding that other gap genes are
present in eubacterial genomes (e.g., Clostridium, Corynebacterium, Thermotoga, Thermus, and Deinococcus) that apparently are not orthologous to the gap genes
in the aforementioned sequenced genomes (Martin et al.
1993; Brown and Doolittle 1997; Viscogliosi and Müller
1998).
All eukaryotic GAPDH genes surveyed to date are
nuclear encoded and were apparently aquired from eubacteria. The vast majority of eukaryotic GAPDH genes
descend from only two distinct members of the eubacterial class I GAPDH family, genes that are designated
in cyanobacteria as gap1 and gap2 (Martin et al. 1993;
Cerff 1995; Brown and Doolittle 1997; Martin and
Schnarrenberger 1997). Eubacterial gene gap1 is closely
related to eukaryotic gene GapC, encoding, as a rule,
cytosolic GAPDH of glycolytic-gluconeogenetic function. This nuclear gene was probably acquired from the
proteobacterial antecedents of mitochondria (Smith
1989; Martin et al. 1993; Liaud et al. 1994), although
orthologous gap1 genes are present in other eubacterial
phyla, including cyanobacteria. A notable exception to
this rule is cytosolic GapC of Euglena and its glycosomal equivalent in trypanosomes, which are probably not
of mitochondrial origin because they specify a branch
on the GapC tree rooting basally to the proteo- cyano429
430
Figge et al.
bacterial divergence. Another exception is the cytosolic
GapC, only found in certain trypanosomes, which is surprisingly close to the corresponding homolog in g-proteobacteria (Michels et al. 1991; Henze et al. 1995).
The eubacterial gene gap2, on the other hand,
shows only 40% to 50% sequence similarity to the gene
gap1. It has so far been found only in cyanobacteria and
is highly similar to the nuclear gene GapAB encoding
chloroplast Calvin cycle GAPDH in plants, green and
red algae, as well as in Euglena gracilis (Brinkmann et
al. 1989; Martin et al. 1993; Kersanach et al. 1994;
Liaud et al. 1994; Henze et al. 1995). Hence, cyanobacteria and photosynthetic eukaryotes with green and red
plastids possess two divergent GAPDH enzymes that
differ with respect to their cosubstrate specificity and
their role in sugar phosphate metabolism. The glycolytic
(catabolic) enzyme both in plants (GapC) and cyanobacteria (Gap1) is NAD-specific, whereas the photosynthetic enzyme in plants (GapAB) is NADP-dependent
but active with both NADP and NAD in cyanobacteria
(Gap2) (Cerff 1978; Koksharova et al. 1998).
Some cyanobacterial genomes harbor a third member of the eubacterial gap gene family that has not been
found in eukaryotic genomes to date—gap3. The function of cyanobacterial gap3 is not known. Previous analyses suggested that gap3 has phylogenetic affinity to
a gap gene of E. coli (gapB; [Alefounder and Perham
1989]), that has recently been shown to encode a highly
specific erythrose-4-phosphate dehydrogenase (E4PDH)
rather than glyceraldehyde-3-phosphate dehydrogenase
(Zhao et al. 1995). Curiously, gap genes specific to the
Calvin cycle in Rhodobacter sphaeroides (Chen et al.
1991) and Ralstonia eutropha (previously called Alcaligenes eutrophus) (Schaferjohann, Yoo, and Bowien
1995) and Xanthobacter flavus (Meijer, van den Bergh,
and Smith 1996) are more similar to the E4PDH (gapB)
of E. coli than to the Calvin cycle–specific gap2 genes
of cyanobacteria (Henze et al. 1995), indicating that recruitment of GAPDH genes for autotrophic function has
occurred at least twice independently in eubacterial evolution.
In order to better understand the functional and
evolutionary relationships between eubacterial and eukaryotic GAPDH genes, we have used primers that will
amplify various members of the eubacterial gap gene
family to isolate and sequence gap genes from the cyanobacteria Prochloron didemni, Gloeobacter violaceus
PCC7421, Synechococcus PCC7942, and the proteobacteria Paracoccus denitrificans and Ralstonia solanacearum (previously called Pseudomonas solanacearum).
We report 10 new eubacterial GAPDH sequences that
can either be assigned to the gap1 and gap2 subfamilies
or to a cluster that contains gap3 and several related
proteobacterial sequences. We examine the phylogeny
of eubacterial and eukaryotic GAPDH genes in the context of endosymbiotic gene transfer during mitochondrial and plastid symbiosis. Furthermore, we have investigated the effect of long-branch attraction and outgroup effects on the position of the unstably branching,
rapidly evolving GAPDH sequence for GapA from the
photosynthetic protist E. gracilis.
Materials and Methods
Nomenclature
The designations for individual GAPDH genes and
proteins follow the guidelines of the Plant Gene Nomenclature Commission (Mnemonic/Numeric system of
the Mendel Database). All GAPDH genes and proteins
start with the mnemonic term ‘‘gap’’ followed by a number (‘‘gap1’’, ‘‘gap2’’, etc.) or a letter (‘‘GapA’’,
‘‘GapB’’, etc.) or both (‘‘GapA1’’, ‘‘GapA2’’, etc.). All
gene names are written in italics, with the first letter in
lowercase (gap1) or uppercase (GapA) for bacterial/organellar and nuclear-encoded genes, respectively. All
GAPDH proteins, whether encoded by bacterial/organellar or nuclear genes, have the same names as their
corresponding genes but are always written in standard
style and with the first letter in uppercase, e.g., proteins/
enzymes Gap1, GapA, GapA1, etc.
Bacterial Strains, Plasmids, and Culture Conditions
Wild type Synechococcus PCC7942 was grown
photoautotrophically in a modified BG11 medium at
258C under high-intensity fluorescent light (125
mEm22s21) as previously described (Koksharova et al.
1998). G. violaceus (SAG 7.82, Sammlung von Algenkulturen, Göttingen) was grown as indicated above, but
at low light intensities (5 mEm22s21). DNA of R. solanacearum (formerly Burgholderia solanacearumDSM9544) was purchased from the Deutsche Sammlung für Mikroorganismen.
Escherichia coli strains XL1-blue and POP13 were
grown with appropriate antibiotics according to standard
procedures (Sambrook et al. 1989). Cloning vectors
used were phage NM1149 (Murray 1983) and pBluescript-SK(1) (Stratagene). Plasmid transformation,
selection, and testing for recombinant clones was performed as described (Sambrook, Fritsch, and Maniatis
1989).
Cloning and Sequencing of Eubacterial gap Genes
Degenerate primers for two highly conserved regions at the N-terminal and C-terminal ends of GAPDH
proteins (INGFGRI, WYDNE) were used for the amplification of gap genes (95% of the coding sequence)
from genomic DNA samples. For the amplification of
gap sequences from Gloeobacter and P. didemni, the
following primers were used: G/AINGFG: 59-GSNATH
AAYGGNTTYGG-39andWYDNEW:59-CCAYTCRTT
RTCRT-ACCA-39(noncoding strand). In the case of Synechococcus and Ralstonia, slightly different primers
were used: INGFGR: 59-ATHAAYGGNTTYGGNMG39andWYDNE:59-AYTCRTTRTCRTACCA-39 (noncoding strand). Polymerase chain reaction (PCR) conditions were as follows: cycle 1: 938C for 5 min; cycles
2 to 35: 938C for 1 min, 508C for 1 min, and 728C for
2 min; and cycle 36: 728C for 5 min. All reactions were
performed in a Perkin Elmer thermocycler at a Mg21
concentration of 1.5 mM. Chromosomal DNA was prepared as described in Koksharova et al. (1998). Amplification products of corresponding size were treated with
polynucleotide kinase and Klenow DNA polymerase using protocols from the supplier (New England Biolabs).
GAPDH Gene Diversity in Eubacteria and Eukaryotes
Subsequently, the DNA fragments were cloned into
pBluescript-SK1 and both strands of at least three independent PCR-generated clones for each gene were sequenced with appropriate oligonucleotides using the dideoxy chain termination method (Pharmacia protocol).
Synechococcus and Paracoccus gap genes were
cloned from genomic libraries. For the generation of the
Synechococcus library, chromosomal DNA from an autotrophically grown, axenic culture was digested to completion with HindIII and cloned into the HindIII site of
phage lambda NM1149. In case of Paracoccus, chromosomal DNA was completely digested either with
HindIII or with EcoRI and cloned into the HindIII or
EcoRI site of the phage lambda NM1149, respectively.
Packaging of phage particles was performed as described (Hohn and Murray 1977; Kaiser and Murray
1988). Positively hybridizing clones were isolated by
plaque hybridization in 5x SSC, 0.02% PVP, 0.02% Ficoll 400, 0.1% SDS, and 50 mg/ml denatured salmon
sperm DNA. For Synechococcus screens were performed with homologous probes at 688C; the Paracoccus library was probed with the end-labeled (PNK) oligomer WYDNE at 308C. Probes for Synechococcus were
generated by PCR as described above and labeled with
a-32P dCTP by random priming. Filters were washed
twice for 30 min in 2x SSC, 0.1% SDS at the hybridization temperatures. Hybridizing clones were isolated
(Synechococcus gap2 and gap3; Paracoccus two gap
clones) and genomic fragments were subcloned into the
HindIII or EcoRI site of pBluescript-SK1 and sequenced as described above.
The gap1 gene from Synechococcus was cloned
from a Lambda FIXt library (Stratagene). For the generation of the library, genomic DNA from Synechococcus was partially digested with Sau3AI and partially refilled (see protocol of manufacturer) fragments of sizes
between 7 and 23 kb were cloned into the XhoI site of
Lambda FIXt. DNA from hybridizing clones was digested with SalI and subcloned into pBluescript-SK1
and sequenced as described above.
The nucleotide sequence data reported will appear
in the DDBJ/EMBL/GenBank International Nucleotide
Sequence Database under the accession numbers
X91235, X91236, X91237 (gap1, gap2 and gap3 of Synechococcus PCC7942); X92514 (gap1 of Ralstonia solanacearum); AJ007271, AJ007272 (gap2 and gap3 of
Prochloron didemni); AJ007273, AJ007274 (gap1 and
gap2 of Gloeobacter violaceus); AJ012158 and
AJ012157 (gap1 and gap2 of Paracoccus denitrificans
both belonging to subfamily Gap-III).
Phylogenetic Analysis
Except for P. aeroginosa gap2, which was obtained
from TIGR directly, (see: http://www.tigr.org/tdb/mdb/
mdb.html) all GAPDH sequences used were retrieved
from GenBank. Database accession numbers of the sequences included in the dataset are the following: A.
variabilis gap1, L07497; A. variabilis gap2, L07498; A.
variabilis gap3, L07499; Arabidopsis thaliana GapA,
M64114; A. thaliana GapB, M64115; Bacteroides fragilis gap1, U27541; Chlamydia trachomatis gap1,
431
U83198; Chlamydomonas reinhardtii GapA, L27668;
Chondrus crispus GapA, X73033; C. crispus GapC,
X73036; Gallus gallus GapC, K01458; Entamoeba histolytica GapC, M89790; E. coli gap1 (A), X02662; E.
coli gap, X14436; E. gracilis GapA, L21904; E. gracilis
GapC, L21903; Giardia lamblia GapC, U31911; Gracilaria verrucosa GapA, Z15102; Haemophilus influenzae gap1, U32848; Leishmania mexicana GapC,
X65220; Pinus sylvestris GapA, L26923; Pisum sativum
GapA, X15190; P. sativum GapB, X15188; P. sativum
GapC, L07500; Pseudomonas aeruginosa gap1,
M74256; R. eutropha gap1, U12422; Rhodobacter capsulatus gap3, M68914; R. sphaeroides gap, M68914;
Synechocystis PCC6803 gap1, X86375; Synechocystis
PCC6803 gap2, X86376; T. pallidum gap1, AE001255;
Trypanosoma brucei GapC, X53472; T. brucei GapCg,
X59955; Ustilago maydis GapC, X07879; Xanthobacter
flavus gap, U33064; Zea mays GapA, X07157; Zymomonas mobilis gap, M18802.
Deduced amino acid sequences were aligned with
CLUSTAL W (Thompson, Higgins, and Gibson 1994)
and refined by hand using the SEQED option of the
MUST package (Philippe 1993). After the exclusion of
gaps and the elimination of the N- and C-terminal regions in front of GINGFG and after WYDNE, the resulting alignment contains 49 sequences with 307 positions. A phylogenetic tree was constructed using Kimura distance estimation and the NJ algorithm (Saitou
and Nei 1987), and subsequently the statistical support
for all nodes was estimated by 1,000 bootstrap replicates
using the NJBOOT option of the MUST package. The
same alignment was used in a maximum-parsimony
(MP) (Swofford 1993) bootstrap analysis (100 replicates). In MP bootstrap analyses, the default settings
were used, except that (1) bootstrap values lower than
50% were retained, and (2) sequences were randomly
added for each of the 100 bootstrap replicates. The same
dataset of 49 sequences was also subjected to maximumlikelihood analyses (Adachi and Hasegawa 1996) with
the NJ DIST and local rearrangement options of
ProtML.
Results
Cloning and Sequence Analysis of Bacterial
gap Genes
We determined five partial and five complete
GAPDH sequences from three cyanobacteria and from
one a- and one b-proteobacterium (shown in fig. 1). The
prochlorophyte P. didemni (Lewin and Withers 1975)
harbors homologs to the gap2 and gap3 genes from A.
variabilis (ATCC 29413); the sequences from G. violaceus PCC7421 (Rippka, Waterbury, and Cohen-Bazire
1974) could be identified as gap1 and gap2 homologs.
As in Anabaena, three GAPDH genes are present in
Synechococcus PCC7942. Two highly divergent gap
genes were found in the a-proteobacterium P. denitrificans, and in the b-proteobacterium R. (Pseudomonas)
solanacearum, only a single gap1 gene was amplified.
A comparison of the 10 deduced protein sequences
for Gap1, Gap2, and Gap3 determined in this study with
432
Figge et al.
GAPDH Gene Diversity in Eubacteria and Eukaryotes
some of their respective homologs in other eubacteria
and eukaryotes (fig. 1) revealed unexpectedly high similarities between the euglenozoan GapC sequences from
Euglena and Trypanosoma brucei (75% similarity) on
the one hand and the Gap1 sequence of the spirochete
T. pallidum (Fraser et al. 1998) on the other (67% and
66% similarity, respectively). This euglenozoan/spirochaete subgroup has a quite low overall similarity relative to all other members of the Gap1/GapC family
(48% to 61% based on the Gap1/GapC dataset used for
tree construction in fig. 2), which coincides with the
presence of three unique conserved insertions at positions 22A–22C (-K/GLL-), 164A (-G-), and 302A–302E
(-LPNEK/R-), respectively (see fig. 1). Trypanosoma
and Euglena share another unique insertion at position
60A–60H (-KSDANLAE- and -KSKPSVAK-, respectively) which is not present in Treponema. All insertions, except G-164A, contain charged residues, suggesting that they are solvent exposed and located at the
surface of the native subunit.
Phylogenetic Analysis of GAPDH Sequences
We selected 32 different organisms harboring 49
GAPDH proteins for phylogenetic analyses. Three hundred and seven positions were used to construct phylogenetic trees by distance matrix (Kimura correction) and
maximum-parsimony methods. Since the resulting topologies were almost identical, only the phylogenetic
tree resulting from the distance matrix analysis is shown
in figure 2. In order to estimate the robustness of internal
branches, bootstrap proportions for both NJ and MP
trees (1,000 and 100 replicates, respectively) were calculated and are indicated at the corresponding nodes.
The resulting tree shows three major subgroups, containing cyanobacterial gap genes that are related to the
two distinct genes gap1 and gap2 and a third cluster
that contains, in addition to the gap3 sequences, several
gap3-related proteobacterial genes. For the three different categories of Gap genes/proteins defined by bootstrap values higher than 90%, we chose the operational
designations type Gap-I, type Gap-II, and type Gap-III.
These definitions are based on phylogenetic criteria and
are independent of the individual, species-specific gene
designations which usually have no phylogenetic meaning and, hence, lead to confusions in intertaxonic and
evolutionary comparisons (e.g., genes gapB and GapB
of E. coli and higher plants, respectively; see fig. 2)
433
In the Gap1/GapC (Gap-I) subtree, a group of proteobacterial sequences is evident that includes the Gap1
sequences from E. coli and R. solanacearum and the
sequences for GapC of the eukaryotic cytosol that are
arguably of mitochondrial origin (Martin et al. 1993;
Henze et al. 1995; Martin and Schnarrenberger 1997).
A Gap sequence from B. fragilis that belongs to the
bacteroides/flavobacterium group branches robustly with
the Gap1 sequence from the b-proteobacterium Ralstonia. The cyanobacterial Gap1 sequences of Synechocystis,
Synechococcus, Anabaena, and Gloeobacter are basal to
sequences from the proteobacterial subdivisions. The
position of C. trachomatis Gap1 could not be resolved
in these analyses. Whereas the NJ and MP methods
place it with cyanobacteria, though with low bootstrap
support, in the ProtML analysis (using NJDIST and the
local-rearrangement option), it is more closely related to
proteobacteria. The GapC sequences of the two Euglenozoa and the spirochete Treponema branch robustly together at the base of the Gap1/GapC (Gap-I) subtree in
figure 2.
The Gap2/GapAB (Gap-II) subtree contains Gap2
sequences from cyanobacteria and the nuclear-encoded
chloroplast enzymes of plants (GapA and GapB). The
Gap2 sequence of Gloeobacter branches basally among
the Gap2 sequences analyzed (fig. 2). Branching orders
within the cyanobacterial Gap2 cluster vary considerably and do not receive high bootstrap support in any
analysis. The position of the GapA sequence from E.
gracilis is very unstable and assumes different branching positions depending on species sampling and the
phylogenetic analysis performed (see below and Discussion).
The Gap-III subtree can be subdivided into three
separate groups: first, it carries a cyanobacterial Gap3
branch which also includes a closely related sequence
from the a-proteobacterium R. capsulatus that was recently established in the course of the Rhodobacter genome project. Sequence identity between the R. capsulatus Gap sequence and cyanobacterial Gap3 sequences
is on the order of 60% to 64%, comparable to the identity between Synechococcus and Anabaena Gap3 (67%).
The branch bearing cyanobacterial Gap3 sequences is
quite long, suggesting an elevated evolutionary rate for
these sequences.
Gap-III carries another fast-evolving group with sequences related to E. coli GapB, that was shown to have
←
FIG. 1.—Alignment of derived amino acid sequences Gap1, Gap2, and Gap3 from Synechococcus with their corresponding homologs in
eubacteria and eukaryotes in three separate blocks representing subfamilies Gap-I, Gap-II, and Gap-III (see fig. 2). Only the reference sequences
(Synechococcus Gap1, Gap2, Gap3) are written in full. Amino acids not identical to the reference are shown. Identical residues are replaced by
dots. The numbering system is in accordance with the one based on the three-dimensional structure of Bacillus stearothermophilus GAPDH.
P188 denotes the presence (Gap1) or absence (Gap2) of Pro-188 indicative of NAD specificity and dual cosubstrate specificity with NAD and
NADP, respectively (Clermont et al. 1993). The only known exception to this rule is glycosomal GapC of trypanosomes (sequence 7), which
has a valine in position 188 and which has been shown to be strictly NAD-specific (Misset et al. 1987). Val-188 is also present in Treponema
but not in Euglena GapC (sequences 5 and 6). Insertions at positions 22A–22C, 164A, and 302A–302E are specific to T. pallidum, E. gracilis
and T. brucei and are indicated by asterisks above the alignment. The arrowhead (.) at the beginning of the sequence indicates the presence
of a transit peptide. Missing residues in partial sequences or indels are indicated by dashes; double points at the C-terminal end denote termination
codons. The strict-consensus sequence is shown below the alignment. Percentage values at the bottom of the sequence were obtained by pairwise
comparison. Sequences from this study are as follows: Synechococcus Gap1, Gap2, Gap3; R. solanacearum Gap1; G. violaceus Gap1 and Gap2;
P. didemni Gap1 and Gap3 and P. denitrificans Gap1 and Gap2 sequences which belong to subfamily Gap-III. Sources for all other sequences
are indicated in Materials and Methods.
434
Figge et al.
FIG. 2.—Phylogenetic tree constructed by the neighbor-joining method with the Kimura formula for the estimation of distances. Bootstrap
values indicate the number of times that a given node was detected out of 1,000 neighbor-joining replicates (upper value) or out of 100 maximumparsimony replicates (lower value). Only bootstrap values above 50% are indicated. Dashes at internodes designate topologies that differed in
the MP bootstrap analysis from the NJ analysis shown and which, in addition, were supported by MP bootstrap values higher than 50 percent.
The scale bar indicates 0.1 amino acid substitutions per nonsynonymous site. Closed Diamonds and arrowheads indicate gene duplications and
gene transfer events, respectively. Sequences established in this study are shown in bold.
erythrose-4-phosphate dehydrogenase activity (Zhao et
al. 1995). In addition, a third group of quite slowly
evolving sequences from a-, b-, and g-proteobacteria is
part of this cluster. The topology of these latter sequences is in agreement with the generally accepted rRNA
phylogeny. This group contains a Gap sequence from R.
sphaeroides that shares only 45% identity with the Gap3
sequence from R. capsulatus. The same degree of divergence is observed when the two newly established
Paracoccus sequences are compared, demonstrating that
the Gap-III cluster is much more divergent than the GapI or Gap-II subtrees.
GAPDH Gene Diversity in Eubacteria and Eukaryotes
435
FIG. 3.—Influence of outgroup species sampling and LBA on the position of Euglena. (A), schematic demonstration of the LBA influence:
distant outgroup sequences pull fast-evolving Euglena GapA from its ‘‘correct’’ position on the green algal branch to the most basal position
of the tree. B, C, and D represent Gap2/GapAB (Gap-II) NJ topologies which are rooted with (B) 33 outgroup sequences, (C) 27 outgroup
sequences, or (D) unrooted, leading to a decreased attraction of Euglena GapA toward the basis of the tree. Outgroup sequences in B are as in
figure 2. In topology C, the following six Gap1/GapC (Gap-I) outgroup sequences have been omitted: E. histolytica, G. lamblia, Leishmania
mexicana, H. influenzae, E. coli, and cytosolic GapC of T. brucei. Only NJ bootstrap values above 40% are shown.
Influence of Outgroup Sampling on the Topology of
Subtree Gap-II
Subtree Gap-II carries only genes from cyanobacteria and photosynthetic eukaryotes with green and red
plastids, such as chlorophytes (landplants and green algae), red algae, and Euglena. The position of Euglena
is surprising, since it does not fall into the eukaryotic
clade but occupies a basal position below all cyanobacteria except the supposedly primitive cyanobacterium
Gloeobacter (see Discussion). This unusual basal position of Euglena may be due to a treeing artifact known
as ‘‘long-branch attraction’’ (LBA, Felsenstein 1978). In
brief, LBA causes rapidly evolving sequences to assume
an artifactually basal position in phylogenetic analysis
in the presence of distantly related outgroup sequences.
In order to examine the influence that LBA might have
on the position of Euglena GapA, we constructed data
sets consisting of fewer or no outgroup sequences to the
Gap-II subtree and repeated the phylogenetic analysis.
These results are shown in figure 3. Figure 3A schematically summarizes the suspected effect that LBA
might have on these data: Inclusion of distantly related
outgroups (long branches by definition) attract longbranch (fast-evolving) ingroup sequences to an artifactual basal position.
Figure 3B shows the basal position of Euglena
GapA in the presence of 33 outgroup sequences (as in
fig. 2). Figure 3C shows the result with the same method
using six fewer distantly related outgroup (GapC) sequences, whereby Euglena GapA moves up in the topology, branching above all cyanobacterial Gap2 se-
quences sampled. In the absence of outgroups to the
Gap-II subtree (fig. 3D), the position of Euglena GapA
moves even further up in the tree, above the rhodophyte
GapA branch, forming a moderately supported branch
(bootstrap value 58%) together with the chlorophyte sequences. Interestingly, also Gloeobacter makes an upward movement in the unrooted subtree Gap-II and
branches now together with Anabaena and Prochloron,
although weakly supported by bootstrap analyses (56%).
Discussion
We have isolated and characterized GAPDH sequences from the cyanobacteria Synechococcus
PCC7942 (genes gap1, gap2, and gap3), P. didemni
(gap2 and gap3), and G. violaceus (gap1 and gap2),
from the b-proteobacterium R. solanacearum (gap1) and
the a-proteobacterium P. denitrificans (two gap genes
belonging to the subfamily Gap-III). Previous studies of
GAPDH gene evolution have indicated that the nuclear
genes for higher-plant GAPDH were acquired from eubacteria in the context of the endosymbiotic origins of
organelles, the gene for GapC of the cytosol having been
obtained from the antecedents of mitochondria and the
gene for GapAB of chloroplasts having been acquired
from the antecedents of plastids (Liaud, Zhang, and
Cerff 1990; Martin et al. 1993; Liaud et al. 1994; Henze
et al. 1995). However, in those studies, lineage-sampling
among those members of the eubacterial GAPDH gene
family that ultimately gave rise to the eukaryotic nuclear
genes GapC and GapAB, respectively (eubacterial gap1
436
Figge et al.
and gap2), was rather limited. In this study, we have
shown that gap1, gap2, and gap3 genes are found not
only in Anabaena (Martin et al. 1993) but are quite
broadly distributed across cyanobacteria, including suspectedly primitive forms such as Gloeobacter and phycobilisome-lacking forms such as Prochloron (Giovannoni et al. 1988; Turner et al. 1989; Nelissen et al.
1995). Furthermore, we have shown that the phylogenetic distribution of gap1 genes extends beyond cyanobacteria and g-proteobacteria, since the b-proteobacterium R. solanacearum was shown here to possess a
gene of the gap1 type. In the course of phylogenetic
analysis of these data, we searched available databases
and identified two additional gap1 genes from the genomes of C. trachomatis and T. pallidum and a member
of the gap3 family in the genome of the a-proteobacterium R. capsulatus.
Our neighbor-joining GAPDH tree shown in figure
2 is in agreement with the maximum-likelihood GAPDH
tree recently published by Viscogliosi and Müller
(1998), which carries a number of additional GAPDH
sequences from Gram-positive and thermophilic bacteria
and parabasalid protists which were not included in the
present study. The following discussion will concentrate
on the phylogenetic implications inferred form subtrees
Gap-I, Gap-II and Gap-III, whose topologies are fairly
stable and not dependent on the inclusion of the complete set of class I GAPDH sequences presently available.
Eubacterial gap1 Genes as Antecedents of Nuclear
Genes Encoding Cytosolic GapC Enzymes in
Eukaryotes (Subtree Gap-I)
On the basis of comparative biochemistry and phylogenetic analyses (Gray 1992), it is now well etablished
that mitochondria derive from a-proteobacteria. GapC
sequences from eukaryotes cluster firmly with gap1
genes from b- and g-proteobacteria (fig. 2), to the exclusion of the basally branching cyanobacterial gap1 homologs. This position of eukaryotic GapC can be most
straightforwardly interpreted as reflecting a mitochondrial (a-proteobacterial) origin of the eukaryotic nuclear
genes. No gap1 homologs have yet been found in aproteobacteria, but this interpretation clearly predicts
that in time, a-proteobacterial gap1 sequences will be
found. The two amitochondriate protozoa G. lamblia
and E. histolytica also possess GAPDH enzymes of suspected mitochondrial origin, suggesting that these organisms indeed once harbored mitochondria (Henze et
al. 1995; Liaud et al. 1997). This is in agreement with
evidence from numerous recent studies that suggest a
secondary loss of mitochondria in these lineages (Clark
and Roger 1995; Keeling and McFadden 1998; Philippe
and Adoutte 1998; Roger et al. 1998) and is consistent
with the view that the common ancestor of all extant
eukaryotes possessed a mitochondrion-like organelle
(Martin and Müller 1998).
In our analyses, the position of C. trachomatis cannot be inferred with confidence, and different methods
place it either with proteobacteria or with cyanobacteria.
In general, the position of Chlamydia in eubacterial phy-
logeny is rather controversial and depends on the molecule of choice (Viale et al. 1994; Eisen 1995). It should
be noted, however, that in a recent analysis of group one
sigma factors, the chlamydiae share a very robust common branch with proteobacteria (Gruber and Bryant
1997).
Treponema Gap1 and euglenozoan GapC share a
robustly supported common branch rooting at the basis
of the Gap-I subtree (fig. 2). In addition, the sequences
from these three organisms (Treponema, Euglena, and
Trypanosoma) share three unique insertions (see Results, fig. 1, and below), suggesting that they have a
common evolutionary origin which is distinct from that
of all other GapC/Gap1 sequences and which apparently
antedates the emergence of mitochondria.
While the majority of eukaryote lineages including
G. lamblia and other amitochondriate diplomonads have
glycolytic GAPDH enzymes belonging to the Gap-I subfamily (GapC enzymes of mainly mitochondrial origin),
Viscogliosi and Müller (1998) have recently shown that
parabasalid flagellates are exceptional in that they adopted a glycolytic GAPDH descending from an independent (eubacterial?) lineage which may be loosely attached to subtree Gap-III. Parabasalid flagellates are
primitive protists with a fermentative metabolism. They
are characterized by the absence of mitochondria and
the presence of an unusual metabolic organelle, the hydrogenosome. It remains to be established whether or
not this enigmatic, novel GAPDH is also present in other, so far not studied, amitochondriate or even mitochondriate protists.
Intra-Kingdom Gene Transfer Between the Phyla
Proteobacteria and Bacteroides/Flavobacteria
(Subtree Gap-I)
A surprising finding is the extremely close relationship between the Gap-I sequences from B. fragilis
and the b-proteobacterium R. solanacearum (80% identity and 100% NJ-bootstrap support). This close relationship of two Gap sequences from organisms that do
not belong to the same eubacterial phylum can clearly
not represent organismal evolution but is most likely the
result of a horizontal gene transfer from a proteobacterium to an ancestral member of the bacteroides/flavobacterium phylum. This transfer must have occurred after
the separation of b- and g-proteobacteria, since the bproteobacterial Gap1 sequence from Ralstonia is significantly closer related to that of Bacteroides (80%) than
it is to the g-proteobacterial E. coli sequence (70%).
Euglenozoan Cytosolic and Glycosomal GapC Genes
Are Unusually Close to gap1 Genes from gProteobacteria and the Spirochete Treponema,
Respectively (Subtree Gap-I)
Euglenozoan GAPDH sequences sampled here represent the most deeply branching lineage on the Gap-I
subtree and, hence, are difficult to interpret as mitochondrial descendants. The two classes of Euglenozoa
represented by Euglena (Euglenoidea) and Trypanosoma (Kinetoplastidea) differ with respect to the localization of their glycolytic enzymes. Whereas Euglena has
GAPDH Gene Diversity in Eubacteria and Eukaryotes
a complete enzymatic machinery for glucose degradation in the cytosol, Trypanosoma performs glycolysis
mainly in the glycosome, an organelle unique to kinetoplastids (Michels and Hannaert 1994). Notably, glycosomal GAPDH of kinetoplastids branches robustly with
cytosolic GAPDH in Euglena, indicating recompartmentalization of this enzyme during euglenozoan evolution
(Henze et al. 1995; Wiemer et al. 1995). In addition,
some kinetoplastids possess a second, distinct isoenzyme in the cytosol (Michels et al. 1991; Hannaert et
al. 1992), which is very closely related to enterobacterial
Gap1 and might have been acquired by kinetoplastids
through a prokaryote-to-eukaryote gene transfer event
that did not entail the establishment of organelles (Henze et al. 1995). This scenario is further substantiated by
the present finding that the phylogenetic position of the
gap sequence from the b-proteobacterium Ralstonia represents an outgroup to the g-proteobacterial/trypanosomal group (fig. 2). This basal position of Ralstonia, together with the absence of the cytosolic equivalent in
the deeper-branching kinetoplastids (Trypanoplasma
[Wiemer et al. 1995]), indicates that the lateral transfer
occured fairly recently in euglenozoan evolution.
While cytosolic GapC of Trypanosoma seems to be
derived from a g-proteobacterial Gap1, glycosomal
GapC of Euglenozoa is surprisingly close to a homolog
present in the spirochete Treponema (100% bootstrap
significance; see fig. 2 and alignment in fig. 1). This
raises the question of the direction and context in which
this horizontal gene transfer may have occurred. If the
gap1 gene was transferred from a spirochete to the ancestor of the Euglenozoa (scenario I), other spirochetes
would be expected to harbor a Treponema gap1 ortholog. So far, only one other spirochetal gap gene from
Borrelia burgdorferi, has been sequenced (see the Borrelia genome database and Fraser et al. [1997]). This
gap gene, however, is clearly not part of the Gap-I subtree (Viscogliosi and Müller 1998). Therefore, under
scenario I, the common ancestor of Treponema and Borrelia would have possessed both divergent gap genes,
and the present state would be the result of selective
loss and retention during the streamlining process, leading to the extremely small genomes of these parasites
(1.1 MB and 0.8 MB).
The alternative hypothesis of a gene transfer in the
opposite, eukaryote to prokaryote, direction (scenario II)
raises other, even more puzzling questions concerning
the origin of the glycosomal GapC gene and the biological context for its transfer to spirochetes. Few prokaryotic genes have been claimed to be of eukaryotic origin,
and mechanisms of their transfer remain highly speculative (for review, see Smith, Feng, and Doolittle 1992).
With the present data, a clear-cut decision concerning
the direction of the transfer cannot be made but will
await the sequencing of further spirochetal gap genes.
Scenario I may also be viewed in the context of
earlier suggestions that glycosomes are of endosymbiotic origin. Support for a xenogenous origin of microbodies, including glycosomes, glyoxysomes, and peroxisomes, is so far based on the presence of some enzymes specific to all microbodies, their way of multi-
437
plication and biogenesis, and most important, on the
presence of similar import signals in all proteins imported (Opperdoes et al. 1998). In case of a cryptic endosymbiosis of an ancestral spirochete, we would expect
some other glycosomal proteins to be closely related to
their homologs in extant spirochetes. Several analyses
performed with glycolytic enzymes other than GAPDH
demonstrated that this is not the case (data not shown).
Therefore, an endosymbiotic scenario in which a spirochete is seen as the ancestor of the glycosome is not
supported by the present data.
Rhodobacter capsulatus Harbors a Cyanobacterial
gap3 Homolog (Subtree Gap-III)
To date, gap3 homologs have only been described
in cyanobacteria. Recently, the R. capsulatus genome
project uncovered the presence of a gap gene in this aproteobacterium which branches robustly with cyanobacterial gap3 sequences just below the Synechococcus
homolog in all analyses performed (fig. 2).
This observation may be explained in alternative
ways: (1) Rhodobacter could have obtained the gap3
gene through a lateral gene transfer from an ancestral
cyanobacterium, or (2) the position of Rhodobacter on
the Gap-III subtree reflects organismal evolution. The
latter interpretation suggests that the common ancestor
of (a-) proteobacteria and cyanobacteria possessed gap3
genes and that these have been lost during the process
of genome evolution in the lineages leading to the sequenced genomes of the g-proteobacteria H. influenzae,
which possesses only gap1, and E. coli, which possesses
gap1 and two other gap genes, that, however, do not fall
into the classes designated as gap1–3 here (Henze et al.
1995). As more proteobacterial genomes are sequenced,
other proteobacteria may become known that also possess gap3 genes.
Is Gloeobacter an Ancestral or a Highly Derived
Cyanobacterium (Subtrees Gap-I and Gap-II)?
It is generally accepted that eukaryotes received
their photosynthetic apparatus through the engulfment
of an ancient cyanobacterium that was subsequently
transformed into plastids (Gray and Doolittle 1982).
Furthermore, the majority of the data are compatible
with a monophyletic origin of all plastids (primary endosymbionts; Gray 1992; Martin, Somerville, and Loiseaux-DeGoër 1992; Loiseaux-DeGoër 1994; Delwiche
and Palmer 1997). The general picture emerging from
16S rRNA, tufA, atpB, and psbA phylogenies shows that
all cyanobacteria, including plastids, are part of a massive radiation that happened rather late in eubacterial
evolution (Morden et al. 1992; Douglas 1994; Nelissen
et al. 1995). The phylogenetic analysis of Gap2 sequences also supports a monophyletic origin of primary plastids and, in addition, an early divergence of Gloeobacter
(fig. 2). Similarly, Gloeobacter appears as one of the
early-branching organisms of the cyanobacterial radiation in 16S rRNA (Wilmotte 1994; Nelissen et al. 1995)
and tufA phylogenies (Koehler et al. 1997). This position
is in agreement with the unique morphology of Gloeobacter, the only cyanobacterial species known to lack
438
Figge et al.
thylakoid membranes (Rippka, Waterbury, and CohenBazire 1974). Additional peculiar morphological features include the lack of sulfoquinovosyl diacylglycerol
lipids (Selstam and Campbell 1996) and an unusual
structure and localization of phycobilisomes (Guglielmi,
Cohen-Bazire, and Bryant 1981). However, in the present study, the early divergence of Gloeobacter is only
supported if an outgroup to the Gap-II tree is present
(fig. 3B and C). In the absence of an outgroup, Gloeobacter and Anabaena form a weakly supported group
(fig. 3D). On the other hand, a very close relationship
between these two organisms is suggested in the Gap-I
subtree. Under the assumption that Gloeobacter is indeed the most primitive cyanobacterium known, Gap1
phylogeny can only be explained by a horizontal gene
transfer from Anabaena to Gloeobacter. Conversely,
Gloeobacter could also be a highly derived organism,
assuming that the position of Gloeobacter in the Gap1
subtree correctly reflects the evolution of the organism
and that the position in the Gap2 subtree is the result of
a treeing artifact. This in turn is in contradiction with
several other phylogenetic studies performed (see
above) that rather support the ancestral state of this organism. More analyses with other molecular markers
will be necessary to clarify this issue.
Influence of Species Sampling on the Position of
Euglena in Subtree Gap-II
The position of Euglena GapA in fig. 2 is at odds
with the hypothesis of Gibbs (Gibbs 1978), who argued
that Euglena’s plastids arose through engulfment of a
(eukaryotic) chlorophyte by a phagocytosing, nonphotosynthetic kinetoplastid host. Data from completely sequenced chloroplast genomes firmly attest to the chlorophytic ancestry of Euglena’s plastids (Martin et al.
1998). It is therefore surprising that Euglena GapA
branches below all cyanobacterial gap genes sampled,
with the exception of Gloeobacter (fig. 2). This, in addition to the conspicuously long branch observed for
Euglena GapA in previous analyses (Henze et al. 1995),
suggested that the position of Euglena GapA might be
due to an artifact of phylogenetic construction known as
long branch attraction (LBA) (Felsenstein 1978).
As shown in figure 3B–D, reducing the number of
outgroup sequences leads to a progressive movement of
Euglena GapA towards the top of the Gap-II subtree
until it branches above the rhodophyte GapA branch in
the unrooted Gap-II topology. This latter position is
much more compatible with chloroplast-encoded protein
data that confirm the chlorophytic ancestry of Euglena’s
plastids (Martin et al. 1998). It is thus congruent with
the view that the GapA gene of Euglena descends from
a GAPDH gene of plastid origin that was donated from
the secondary endosymbiont (chlorophyte) to the host
(kinetoplastid) nucleus, where the gene product was successfully rerouted to the organelle of its genetic origin.
Notably, nuclear-encoded chloroplast GAPDH from other secondary endosymbionts do not belong to the
GapAB group but are, rather, duplicated genes of the
Gap1/GapC (Gap-II) type (Liaud et al. 1997).
In summary, the complex architecture of the universal class I GAPDH tree is due to multiple successive
gene duplications, intra- and interkingdom gene transfer
(endosymbiotic and nonendosymbiotic), selective loss/
retention of genes, and lineage-specific differences in
evolutionary rate. In spite of this complexity, a large
fraction of GAPDH gene diversity falls into three major
categories, represented by subtrees Gap-I, Gap-II, and
Gap-III (fig. 2), carrying genes related to cyanobacterial
genes gap1, gap2, and gap3, first described for A. variabilis. All genes present in extant eukaryotes, except
those from parabasalid flagellates, are derived from eubacterial genes gap1 and gap2 and were transferred to
the nucleus, probably in connection with the acquisition
of mitochondria (gap1 → GapC) and chloroplasts (gap2
→ GapA), respectively. Glycosomal GapC of Euglenozoa is a notable exception. It seems to antedate the symbiotic event leading to mitochondria and is specifically
related to Gap1 of the spirochete T. pallidum, suggesting
that it spread by interkingdom gene transfer the direction
of which remains elusive at the present time.
Acknowledgments
We thank Roger Hiller and Pamela Wrench (Sydney) for DNA samples of P. didemni, Frank Hören
(GBF Braunschweig) for genomic DNA from P. denitrificans and Petra Brandt (GBF Braunschweig) for her
help with sequencing. William Martin is acknowledged
for valuable discussions and critical comments on the
text. We also thank the three referees for their competent
and constructive comments on the manuscript. Major
financial support, including a Ph.D. stipend for R.M.F.,
has been received from the DFG priority program ‘‘The
molecular basis of plant evolution’’. M.S. received a stipend from the HSP-II-Frauenförderung der TU Braunschweig. H.B. gratefully acknowledges additional financial support from the CNRS.
LITERATURE CITED
ADACHI, J., and M. HASEGAWA. 1996. Model of amino acid
substitution in proteins encoded by mitochondrial DNA. J.
Mol. Evol. 42:459–468.
ALEFOUNDER, P. R., and R. N. PERHAM. 1989. Identification,
molecular cloning and sequence analysis of a gene cluster
encoding the class II fructose 1,6-bisphosphate aldolase, 3phosphoglycerate kinase and a putative second glyceraldehyde 3-phosphate dehydrogenase of Escherichia coli. Mol.
Microbiol. 3:723–732.
BLATTNER, F. R., G. R. PLUNKETT, C. A. BLOCH et al. (17 coauthors). 1997. The complete genome sequence of Escherichia coli K-12. Science 277:1453–1474.
BRINKMANN, H., R. CERFF, M. SALOMON, and J. SOLL. 1989.
Cloning and sequence analysis of cDNAs encoding the cytosolic precursors of subunits GapA and GapB of chloroplast glyceraldehyde-3-phosphate dehydrogenase from pea
and spinach. Plant Mol. Biol. 13:81–94.
BRINKMANN, H., and W. MARTIN. 1996. Higher-plant chloroplast and cytosolic 3-phosphoglycerate kinases: a case of
endosymbiotic gene replacement. Plant Mol. Biol. 30:65–
75.
GAPDH Gene Diversity in Eubacteria and Eukaryotes
BROWN, J. R., and W. F. DOOLITTLE. 1997. Archaea and the
prokaryote-to-eukaryote transition. Microbiol. Mol. Biol.
Rev. 61:456–502.
CERFF, R. 1978. Glyceraldehyde-3-phosphate dehydrogenase
(NADP) from Sinapis alba: steady state kinetics. Phytochemistry 17:2061–2067.
CERFF, R. 1995. The chimeric nature of nuclear genomes and
the antiquity of introns as demonstrated by the GAPDH
gene system. Pp. 205–227 in M. GÕ and P. SCHIMMEL, eds.
Tracing biological evolution in protein and gene structures.
Elsevier, Amsterdam.
CHEN, J. H., J. L. GIBSON, L. A. MCCUE, and F. R. TABITA.
1991. Identification, expression, and deduced primary structure of transketolase and other enzymes encoded within the
form II CO2 fixation operon of Rhodobacter sphaeroides.
J. Biol. Chem. 266:20447–20452.
CLARK, C. G., and A. J. ROGER. 1995. Direct evidence for
secondary loss of mitochondria in Entamoeba histolytica.
Proc. Natl. Acad. Sci. USA 92:6518–6521.
CLERMONT, S., C. CORBIER, Y. MELY, D. GERARD, A. WONACOTT, and G. BRANLANT. 1993. Determinants of coenzyme
specificity in glyceraldehyde-3-phosphate dehydrogenase:
role of the acidic residue in the fingerprint region of the
nucleotide binding fold. Biochemistry 32:10178–10184.
DELWICHE, C. F., and J. D. PALMER. 1997. The origin of plastids and their spread via secondary endosymbiosis. Pp. 53–
86 in D. BHATTACHARYA, ed. Origins of algae and their
plastids. Springer Verlag, Wien.
DOUGLAS, S. E. 1994. Chloroplast origins and evolution. Pp.
91–118 in D. A. BRYANT, ed. The molecular biology of
Cyanobacteria. Kluwer Academic Publishers, The Netherlands.
EISEN, J. A. 1995. The RecA protein as a model molecule for
molecular systematic studies of bacteria: comparison of
trees of RecAs and 16S rRNAs from the same species. J.
Mol. Evol. 41:1105–1123.
FELSENSTEIN, J. 1978. Cases in which parsimony and compatibility methods will be positively misleading. System. Zool.
27:401–410.
FOTHERGILL-GILMORE, L. A., and P. A. M. MICHELS. 1993.
Evolution of glycolysis. Prog. Biophys. Molec. Biol. 59:
105–235.
FRASER, C. M., S. CASJENS, W. M. HUANG et al. (38 co-authors). 1997. Genomic sequence of a Lyme disease spirochaete, Borrelia burgdorferi. Nature 390:580–586.
FRASER, C. M., S. J. NORRIS, G. M. WEINSTOCK et al. (33 coauthors). 1998. Complete genome sequence of Treponema
pallidum, the syphilis spirochete. Science 281:375–388.
GIBBS, S. 1978. The chloroplasts of Euglena may have evolved
from symbiotic green algae. Can. J. Bot. 56:2883–2889.
GIOVANNONI, S. J., S. TURNER, G. J. OLSEN, S. BARNS, D. J.
LANE, and N. R. PACE. 1988. Evolutionary relationships
among cyanobacteria and green chloroplasts. J. Bacteriol.
170:3584–3592.
GRAY, M. W. 1992. The endosymbiont hypothesis revisited.
Int. Rev. Cytol. 141:233–357.
GRAY, M. W., and W. F. DOOLITTLE. 1982. Has the endosymbiont hypothesis been proven? Microbiol. Rev. 46:1–42.
GRUBER, T. M., and D. A. BRYANT. 1997. Molecular systematic
studies of eubacteria, using sigma70-type sigma factors of
group 1 and group 2. J. Bacteriol. 179:1734–1747.
GUGLIELMI, G., G. COHEN-BAZIRE, and D. A. BRYANT. 1981.
The structure of Gloeobacter violaceus and its phycobilisomes. Arch. Microbiol. 129:181–189.
HANNAERT, V., M. BLAAUW, L. KOHL, S. ALLERT, F. R. OPPERDOES, and P. A. MICHELS. 1992. Molecular analysis of
the cytosolic and glycosomal glyceraldehyde-3-phosphate
439
dehydrogenase in Leishmania mexicana. Mol. Biochem.
Parasitol. 55:115–126.
HENSEL, R., P. ZWICKL, S. FABRY, J. LANG, and P. PALM. 1989.
Sequence comparison of glyceraldehyde-3-phosphate dehydrogenases from the three urkingdoms: evolutionary implication. Can. J. Microbiol. 35:81–85.
HENZE, K., A. BADR, M. WETTERN, R. CERFF, and W. MARTIN.
1995. A nuclear gene of eubacterial origin in Euglena gracilis reflects cryptic endosymbioses during protist evolution.
Proc. Natl. Acad. Sci. USA 92:9122–9126.
HOHN, B., and K. MURRAY. 1977. Packaging recombinant
DNA molecules into bacteriphages particles invitro. Proc.
Natl. Acad. Sci. USA 74:3259.
KAISER, K., and N. E. MURRAY. 1988. The use of phage lambda replacement vectors in the construction of representative
genomic DNA libraries. Pp. 1–48 in D. M. GLOVER, ed.
DNA cloning: a practical approach. IRL Press, Oxford.
KANEKO, T., S. SATO, H. KOTANI et al. (24 co-authors). 1996.
Sequence analysis of the genome of the unicellular cyanobacterium Synechocystis sp. strain PCC6803. II. Sequence
determination of the entire genome and assignment of potential protein-coding regions. DNA Res. 3:109–136.
KEELING, P. J., and G. I. MCFADDEN. 1998. Origins of microsporidia. Trends Microbiol. 6:19–23.
KERSANACH, R., H. BRINKMANN, M. F. LIAUD, D. X. ZHANG,
W. MARTIN, and R. CERFF. 1994. Five identical intron positions in ancient duplicated genes of eubacterial origin. Nature 367:387–389.
KOEHLER, S., C. F. DELWICHE, P. W. DENNY, L. G. TILNEY, P.
WEBSTER, R. J. M. WILSON, J. D. PALMER, and D. S. ROOS.
1997. A plastid of probable green algal origin in apicomplexan parasites. Science 275:1485–1489.
KOKSHAROVA, O., M. SCHUBERT, S. SHESTAKOV, and R.
CERFF. 1998. Genetic and biochemical evidence for distinct
key functions of two highly divergent GAPDH genes in
catabolic and anabolic carbon flow of the cyanobacterium
Synechocystis sp. PCC 6803. Plant Mol. Biol. 36:183–194.
KUNST, F., N. OGASAWARA, I. MOSZER et al. (151 co-authors).
1997. The complete genome sequence of the gram-positive
bacterium Bacillus subtilis. Nature 390:249–256.
LEWIN, R. A., and N. W. WITHERS. 1975. Extraordinary pigment composition of a prokaryotic alga. Nature 256:735–
737.
LIAUD, M. F., U. BRANDT, M. SCHERZINGER, and R. CERFF.
1997. Evolutionary origin of cryptomonad microalgae: two
novel chloroplast/cytosol-specific GAPDH genes as potential markers of ancestral endosymbiont and host cell components. J. Mol. Evol. 44:S28-S37.
LIAUD, M. F., C. VALENTIN, W. MARTIN, F. Y. BOUGET, B.
KLOAREG, and R. CERFF. 1994. The evolutionary origin of
red algae as deduced from the nuclear genes encoding cytosolic and chloroplast glyceraldehyde-3-phosphate dehydrogenases from Chondrus crispus. J. Mol. Evol. 38:319–
327.
LIAUD, M. F., D. X. ZHANG, and R. CERFF. 1990. Differential
intron loss and endosymbiotic transfer of chloroplast glyceraldehyde-3-phosphate dehydrogenase genes to the nucleus. Proc. Natl. Acad. Sci. USA 87:8918–8922.
LOISEAUX-DEGOËR, S. 1994. Plastid lineages. Pp. 138–177 in
F. E. ROUND and D. J. CHAPMAN, eds. Progress in phycological research. Biopress Ltd., Bristol, U.K.
MARTIN, W., H. BRINKMANN, C. SAVONA, and R. CERFF. 1993.
Evidence for a chimeric nature of nuclear genomes: eubacterial origin of eukaryotic glyceraldehyde-3-phosphate dehydrogenase genes. Proc. Natl. Acad. Sci. USA 90:8692–
8696.
440
Figge et al.
MARTIN, W., and M. MÜLLER. 1998. The hydrogen hypothesis
for the first eukaryote. Nature 392:37–41.
MARTIN, W., and C. SCHNARRENBERGER. 1997. The evolution
of the Calvin cycle from prokaryotic to eukaryotic chromosomes: a case study of functional redundancy in ancient
pathways through endosymbiosis. Curr. Genet. 32:1–18.
MARTIN, W., C. C. SOMERVILLE, and S. LOISEAUX-DEGOËR.
1992. Molecular phylogenies of plastid origins and algal
evolution. J. Mol. Evol. 35:385–404.
MARTIN, W., B. STÖBE, V. GOREMYKIN, S. HANSMANN, M.
HASEGAWA, and K. V. KOWALLIK. 1998. Gene transfer to
the nucleus and the evolution of chloroplasts. Nature 393:
162–165.
MEIJER, W. G., E. R. VAN DEN BERGH, and L. M. SMITH. 1996.
Induction of the gap-pgk operon encoding glyceraldehyde3-phosphate dehydrogenase and 3-phosphoglycerate kinase
of Xanthobacter flavus requires the LysR-type transcriptional activator CbbR. J. Bacteriol. 178:881–887.
MICHELS, P. A., and V. HANNAERT. 1994. The evolution of
kinetoplastid glycosomes. J. Bioenerg. Biomembr. 26:213–
219.
MICHELS, P. A., M. MARCHAND, L. KOHL, S. ALLERT, R. K.
WIERENGA, and F. R. OPPERDOES. 1991. The cytosolic and
glycosomal isoenzymes of glyceraldehyde-3-phosphate dehydrogenase in Trypanosoma brucei have a distant evolutionary relationship. Eur. J. Biochem. 198:421–428.
MISSET, O., J. VAN BEEUMEN, A. M. LAMBEIR, R. VAN DER
MEER, and F. R. OPPERDOES. 1987. Glyceraldehyde phosphate dehydrogenase from Trypanosoma brucei: comparison of the glycosomal and cytosolic isoenzymes. Eur. J.
Biochem. 162:501–508.
MORDEN, C. W., C. F. DELWICHE, M. KUHSEL, and J. D. PALMER. 1992. Gene phylogenies and the endosymbiotic origin
of plastids. BioSystems 28:75–90.
MURRAY, N. E. 1983. Phage lambda and molecular cloning.
Pp. 395–432 in R. W. HENDRIX, ed. Lambda II Monograph
13. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
NELISSEN, B., Y. VAN DE PEER, A. WILMOTTE, and R. DE
WACHTER. 1995. An early origin of plastids within the cyanobacterial divergence is suggested by evolutionary trees
based on complete 16S rRNA sequences. Mol. Biol. Evol.
12:1166–1173.
OPPERDOES, F. R., P. A. M. MICHELS, C. ADJE, and V. HANNAERT. 1998. Organelle and enzyme evolution in trypanosomatids. Pp. 229–253 in G. H. COOMBS, K. VICKERMANN,
M. A. SLEIGH, and A. WARREN, eds. Evolutionary relationships among protozoa. Chapman & Hall, London.
PHILIPPE, H. 1993. MUST, a computer package of management
utilities for sequences and trees. Nucleic Acids Res. 21:
5264–5272.
PHILIPPE, H., and A. ADOUTTE. 1998. The molecular phylogeny
of eukaryota: solid facts and uncertainties. Pp. 25–56 in G.
H. COOMBS, K. VICKERMANN, M. A. SLEIGH, and A. WARREN, eds. Evolutionary relationships among protozoa. Chapman & Hall, London.
PRÜß, B., H. E. MEYER, and A. W. HOLLDORF. 1993. Characterization of the glyceraldehyde 3-phosphate dehydrogenase from the extremely halophilic archaebacterium Haloarcula vallismortis. Arch. Microbiol. 160:5–11.
RIPPKA, R., J. B. WATERBURY, and G. COHEN-BAZIRE. 1974.
A cyanobacterium which lacks thylakoids. Arch. Microbiol.
100:419–436.
ROGER, A. J., S. G. SVARD, J. TOVAR, C. G. CLARK, M. W.
SMITH, F. D. GILLIN, and M. L. SOGIN. 1998. A mitochondrial-like chaperonin 60 gene in Giardia lamblia: evidence
that diplomonads once harbored an endosymbiont related to
the progenitor of mitochondria. Proc. Natl. Acad. Sci. USA
95:229–234.
SAITOU, N., and M. NEI. 1987. The neighbor-joining method:
a new method for reconstructing phylogenetic trees. Mol.
Biol. Evol. 4:406–425.
SAMBROOK, J., E. F. FRITSCH, and T. MANIATIS. 1989. Molecular cloning—a laboratory manual. Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, NY.
SCHAFERJOHANN, J., J. G. YOO, and B. BOWIEN. 1995. Analysis
of the genes forming the distal parts of the two cbb CO2
fixation operons from Alcaligenes eutrophus. Arch. Microbiol. 163:291–299.
SELSTAM, E., and D. CAMPBELL. 1996. Membrane lipid composition of the unusual cyanobacterium Gloeobacter violaceus sp. PCC 7421, which lacks sulfoquinovosyl diacylglycerol. Arch. Microbiol. 166:132–135.
SMITH, M. W., D. F. FENG, and R. F. DOOLITTLE. 1992. Evolution by acquisition: the case for horizontal gene transfers.
Trends Biochem. Sci. 17:489–493.
SMITH, T. L. 1989. Disparate evolution of yeasts and filamentous fungi indicated by phylogenetic analysis of glyceraldehyde-3-phosphate dehydrogenase genes. Proc. Natl.
Acad. Sci. USA 86:7063–7066.
SWOFFORD, D. L. 1993. PAUP: phylogenetic analysis using
parsimony. Version 3.1.1. Laboratory of Molecular Systematics, Smithsonian Institution, Washington, D.C.
THOMPSON, J. D., D. G. HIGGINS, and T. J. GIBSON. 1994.
CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673–4680.
TURNER, S., T. BURGER-WIERSMA, S. J. GIOVANNONI, L. R.
MUR, and N. R. PACE. 1989. The relationship of a prochlorophyte Prochlorothrix hollandica to green chloroplasts.
Nature 337:380–382.
VIALE, A. M., A. K. ARAKAKI, F. C. SONCINI, and R. G. FERREYRA. 1994. Evolutionary relationships among eubacterial
groups as inferred from GroEL (chaperonin) sequence comparisons. Int. J. Syst. Bacteriol. 44:527–33.
VISCOGLIOSI, E., and M. MÜLLER. 1998. Phylogenetic relationships of the glycolytic enzyme, glyceraldehyde-3-phosphate
dehydrogenase, from parabasalid flagellates. J. Mol. Evol.
47:190–199.
WIEMER, E. A., V. HANNAERT, P. R. L. A. VAN DEN IJSSEL, J.
VAN ROY, F. R. OPPERDOES, and P. A. MICHELS. 1995. Molecular analysis of glyceraldehyde-3-phosphate dehydrogenase in Trypanoplasma borelli: an evolutionary scenario of
subcellular compartmentation in kinetoplastida. J. Mol.
Evol. 40:443–454.
WILMOTTE, A. 1994. Molecular evolution and taxonomy of
cyanobacteria. Pp. 1–25 in D. A. BRYANT, ed. The molecular biology of Cyanobacteria. Kluwer Academic Publishers, The Netherlands.
ZHAO, G., A. J. PEASE, N. BHARANI, and M. E. WINKLER.
1995. Biochemical characterization of gapB-encoded erythrose 4-phosphate dehydrogenase of Escherichia coli K-12
and its possible role in pyridoxal 59-phosphate biosynthesis.
J. Bacteriol. 177:2804–2812.
AXEL MEYER, reviewing editor
Accepted December 7, 1998