Bioinformatics Practical 5
Reconstructing metabolic processes from genome sequence
1. The shikimate pathway
First you are going to familiarise yourself with an important metabolic pathway, known as
the shikimate pathway. It is shown in the figure opposite: beginning with simple compounds
(dotted arrows show inputs to the pathway), its function is to synthesise chorismate. This is
a precursor for a number of important pathways. The diagram shows the folate pathway that
produces dihydrofolate and tetrahydrofolate. Other important branches are shown at
chorismate in the figure with the ? symbol. These represent the biosynthesis of
phenylalanine/tyrosine and also tryptophan. The shikimate pathway is not present in man,
and we rely on dietary supply of many of its downstream products (including folate and the
essential amino acids phenylalanine and tryptophan). First let’s find this pathway in the
KEGG reference network. Go to http://www.genome.jp/kegg/ and follow the link to KEGG
Pathway. From here find ‘Phenylalanine, tyrosine and tryptophan biosynthesis’ and try to
locate the pathway on the
network diagram. Write down a list of the EC numbers catalysing the steps in the
pathway to chorismate (in the case where 3 EC numbers are given you may assume
that the relevant one is 1.1.1.25). Where do the inputs to the pathway come from?
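If you would like to check your hand-written list against the database, the lookup can also be scripted. The following is a minimal Python sketch, not part of the practical itself. It assumes that the KEGG REST `link` operation is available at the URL shown, that `map00400` is the identifier of this reference map, and that each line of the reply has the form `path:map00400<TAB>ec:x.x.x.x`. The function names are invented for illustration. Note that it returns every EC number on the map, so you still need to pick out the steps leading to chorismate yourself.

```python
import re
import urllib.request

# Assumed endpoint: KEGG REST "link" operation for reference map 00400
# (Phenylalanine, tyrosine and tryptophan biosynthesis).
KEGG_LINK_URL = "https://rest.kegg.jp/link/enzyme/map00400"

# Matches fully specified EC numbers only (four numeric fields).
EC_PATTERN = re.compile(r"ec:(\d+\.\d+\.\d+\.\d+)")


def parse_ec_numbers(link_text):
    """Return the de-duplicated EC numbers found in the reply, sorted numerically."""
    found = set(EC_PATTERN.findall(link_text))
    # Sort by the four numeric fields rather than alphabetically.
    return sorted(found, key=lambda ec: tuple(int(part) for part in ec.split(".")))


def fetch_ec_numbers(url=KEGG_LINK_URL):
    """Download the link table (needs network access) and parse it."""
    with urllib.request.urlopen(url) as response:
        return parse_ec_numbers(response.read().decode("utf-8"))
```

Calling `fetch_ec_numbers()` on a machine with internet access should give one EC number per enzyme box on the reference map; `parse_ec_numbers` can be tried offline on any text pasted from the browser.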
Now get an understanding of the chemistry done by this pathway. Draw the
structures of the inputs (phosphoenolpyruvate and erythrose 4-phosphate), and the
product (chorismate). You may find it helpful to look up the first and last reactions in
the KEGG reaction database. You can do this by clicking on the EC number box on
the KEGG reference map and then following the link to ‘RN’ in the ‘Other DBs’
section.
2. The pathway in plants and yeast
Now go back to the KEGG pathway diagram and change the display from reference
pathway to look at the pathway in the yeast Saccharomyces cerevisiae. Does yeast
have the shikimate pathway? Does it also synthesise phenylalanine, tyrosine
and tryptophan? Look also at the folate pathway. Does KEGG find the complete
pathway in yeast?
Plants also have the shikimate pathway. Use KEGG to examine the evidence for this
in the primitive plant Chlamydomonas reinhardtii (this is a green alga), and
Arabidopsis thaliana.
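Asking whether a pathway is "complete" in an organism amounts to comparing two sets: the EC numbers of the reference pathway and the EC numbers annotated in that organism. The Python sketch below illustrates the comparison. The function name is invented for illustration, the reference list contains only three placeholder steps (extend it with your own list from section 1), and the annotation set for `organism_x` is hypothetical rather than real KEGG data.

```python
def pathway_coverage(reference_steps, annotated_ecs):
    """Split the reference steps into those found and those missing.

    reference_steps: EC numbers in pathway order (from the reference map).
    annotated_ecs:   EC numbers annotated in the organism of interest.
    """
    annotated = set(annotated_ecs)
    found = [ec for ec in reference_steps if ec in annotated]
    missing = [ec for ec in reference_steps if ec not in annotated]
    return found, missing


# Placeholder reference list: three real shikimate pathway EC numbers
# (shikimate dehydrogenase, EPSP synthase, chorismate synthase).
reference = ["1.1.1.25", "2.5.1.19", "4.2.3.5"]

# Hypothetical annotation for an imaginary organism, NOT real KEGG data.
organism_x = {"4.2.3.5"}

found, missing = pathway_coverage(reference, organism_x)
print("found:  ", found)    # ['4.2.3.5']
print("missing:", missing)  # ['1.1.1.25', '2.5.1.19']
```

Steps in the `missing` list are the "holes" in the pathway: they may be genuinely absent, or simply not detected by the annotation method.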
3. Apicomplexan parasites
Apicomplexan parasites are an important group of single-celled eukaryotic parasites,
which typically infect animals by invading specific types of cell. They are named
after the apical complex, an organelle they share, associated with cell invasion. They
are a deeply branching phylum (you have to go a long way back before you reach
their common ancestor with other existing species). Examples of them are the
malaria parasite (genus Plasmodium), Toxoplasma, Cryptosporidium, Theileria,
Babesia and Eimeria. Plasmodium causes malaria by invading liver and red blood
cells; Toxoplasma causes a mild illness that is dangerous if you are pregnant or
immune-suppressed (e.g. AIDS patients); Cryptosporidium is water-borne and
causes a mild illness; and the other examples affect agriculturally important animals.
With the exception of Cryptosporidium, the Apicomplexans have a chloroplast-like
organelle (the apicoplast) which they gained by the evolutionary process of
endosymbiosis (a process where a primitive eukaryotic cell engulfed a
photosynthetic algal cell). They have lost any photosynthetic capability but retain
several metabolic pathways associated with chloroplasts, including fatty acid
biosynthesis and the non-mevalonate (DOXP) pathway for isoprenoid biosynthesis.
Use KEGG to see if you can find evidence for the shikimate pathway in Toxoplasma
gondii. Look also at the Leeds site metaTIGER
(http://www.bioinformatics.leeds.ac.uk/metatiger/): you need to use ‘View pathways’
and choose the organism and pathway. You should see that this site finds evidence
for the whole pathway.
Both KEGG and metaTIGER are automated genome analysis methods that were
covered in the lecture. KEGG finds enzymes using a bi-directional BLAST procedure
while metaTIGER uses a set of high quality hidden Markov models. They both have
strengths and weaknesses. KEGG can cover more enzymes (because metaTIGER
does not have hidden Markov models for every EC number), but metaTIGER is more
sensitive (it can detect more distant relationships than BLAST). To produce a really
good metabolic annotation of a genome you usually begin with the results of these
automated procedures and then work manually to produce a metabolic network that
is realistic (i.e. doesn’t have holes in important pathways such as would be predicted
by KEGG in this case).
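The idea behind a bi-directional (reciprocal) best hit search can be sketched in a few lines of Python. This is the generic textbook procedure under simple assumptions, not KEGG's actual implementation: hits are given as (query, subject, bit score) tuples, the best hit is the one with the highest score, and ties are ignored. The sequence names and scores are invented toy data.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits: iterable of (query, subject, bit_score) tuples from a BLAST-like search.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _score) in best.items()}


def bidirectional_best_hits(forward_hits, reverse_hits):
    """Return (a, b) pairs where b is a's best hit AND a is b's best hit."""
    forward = best_hits(forward_hits)
    reverse = best_hits(reverse_hits)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Toy data (hypothetical): reference enzymes searched against a genome ...
forward = [("refA", "org1", 250.0), ("refA", "org2", 90.0), ("refB", "org2", 120.0)]
# ... and the genome's proteins searched back against the reference set.
reverse = [("org1", "refA", 245.0), ("org2", "refA", 95.0), ("org2", "refB", 80.0)]

print(bidirectional_best_hits(forward, reverse))  # [('refA', 'org1')]
```

In the toy data refB's best hit is org2, but org2's best hit is refA, so that pair is rejected. This strictness helps to avoid wrong assignments, but it is also one reason why a method based on BLAST can miss distant homologs that a hidden Markov model would still detect.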
Does the malaria parasite (Plasmodium falciparum) have the shikimate pathway?
4. The shikimate pathway in malaria?
In the last exercise you should have found that malaria clearly has the last enzyme
of the pathway, chorismate synthase (EC 4.2.3.5) but that there is no strong
evidence for any of the others. This is an interesting finding because this enzyme is
not known to have other functions, so why is it there if the rest of the pathway is not?
This is not uncommon in attempts to reconstruct complete metabolic networks from
genome sequence data. Chorismate synthase here appears to be redundant. We
know that the malaria parasite is highly divergent in evolutionary terms and an
interesting feature of the P. falciparum genome is its huge AT richness (80%); this
has led to very divergent proteins. It is possible that genes coding enzymes for the
other steps are just too diverged for us to detect with hidden Markov models. So this
is where metabolic reconstruction moves from an automated to a manual process.
We can start looking for very distant homologs of the remaining enzymes, and we
can also look at the biochemical literature for evidence of the pathway in malaria.
To do the first of these we can again use metaTIGER. Go to the home page and this
time use the list comparison facility. Select all the Plasmodium species and
Toxoplasma gondii (CTRL click to select more than one); set the E value cut-off to
1.0e-01 (to detect very distant homologs); tick the box for ‘Custom ec number list’
and enter all the EC numbers of the pathway in the box, separated by commas, in
pathway order. Press ‘View table’. You should see that there is weaker evidence for
at least the last 3 steps of the pathway and possibly some others, across the various
species of malaria parasite.
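The effect of relaxing the E value cut-off can be illustrated with a short Python sketch. It assumes you have recorded, for each species and EC number, the lowest E value reported; the species names and E values below are invented toy data, not metaTIGER output. The function names are hypothetical, and treating a hit exactly at the cut-off as passing is also an assumption. Only three EC numbers are listed; extend the list with your own from section 1.

```python
def custom_ec_list(ecs):
    """Join EC numbers (already in pathway order) into a comma-separated string."""
    return ",".join(ecs)


def evidence_table(best_evalues, species, ecs, cutoff):
    """Return {species: [True/False per EC]} for hits at or below the cut-off.

    best_evalues maps (species, ec) to the lowest E value found; a missing
    key means that no hit was reported at all.
    """
    return {
        sp: [best_evalues.get((sp, ec), float("inf")) <= cutoff for ec in ecs]
        for sp in species
    }


# Three EC numbers in pathway order; add the remaining steps yourself.
ecs = ["1.1.1.25", "2.5.1.19", "4.2.3.5"]
print(custom_ec_list(ecs))  # 1.1.1.25,2.5.1.19,4.2.3.5

# Hypothetical E values for two imaginary species, NOT real results.
toy = {
    ("species_A", "4.2.3.5"): 1e-60,
    ("species_A", "2.5.1.19"): 3e-2,
    ("species_B", "4.2.3.5"): 1e-45,
}
for cutoff in (1e-10, 1e-1):
    print(cutoff, evidence_table(toy, ["species_A", "species_B"], ecs, cutoff))
```

With the strict cut-off only the last step is supported in the toy data; with the relaxed cut-off a weak hit for 2.5.1.19 also appears in species_A. Weak hits like this are candidates for manual checking, not proof that the enzyme is present.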
5. The biochemical literature
We won’t make you start searching in the literature for evidence of shikimate
biochemistry in malaria parasites, but if you were seriously trying to reconstruct a
complete network this is what you would have to do. In fact Bernhard Palsson’s
research group at the University of California San Diego have spent years doing just this
to reconstruct really complete networks for organisms like E. coli and yeast. For your
interest however, here is a summary.
First, we are interested in this because it is a pathway that potentially appears in
Apicomplexan parasites that is not in man, so it is potentially a target for drugs. In
fact the herbicide glyphosate is an inhibitor of this pathway in plants: it inhibits
EPSP synthase (2.5.1.19), and kills plants by causing a toxic intermediate to build up and
by stopping them from synthesising important amino acids. Interestingly glyphosate
also kills malaria and Toxoplasma parasites in vitro. This confirms the potential of this
pathway as a drug target and increases interest in the Apicomplexan variants of its
enzymes.
There is a lot of literature on this, some of it from Leeds parasitologist Dr. Glenn
McConkey. It is fairly certain that the malaria parasite has limited ability to
synthesise amino acids: during the red blood cell stages, where its growth is most
rapid and which cause the symptoms of malaria, it obtains almost all that it needs
from haemoglobin breakdown. In particular it does not make chorismate as a
precursor for aromatic amino acid biosynthesis. On the other hand malaria does not
need a supply of folate to grow and appears to have a pathway for
dihydro/tetrahydrofolate biosynthesis that would require chorismate. In fact some
malaria drugs (pyrimethamine, cycloguanil, sulphonamides) inhibit this pathway.
Furthermore, malaria parasites are not killed by glyphosate if you supply the
parasite with 4-amino benzoate (also known as pABA – para-amino benzoic acid)
(see the folate pathway diagram above), which would normally be derived from
chorismate. So overall there is substantial evidence that malaria parasites do some
shikimate biochemistry, and that it is essential for folate production.
Currently we don’t know any more than this. The pathway in malaria remains
enigmatic. Are the enzymes for earlier steps coded by even more divergent genes
that we haven’t detected, or perhaps catalysed by enzymes not homologous to those
found in other species? In fact almost 60% of malaria genes remain without
functional annotation, and these functions could be coded by some of these genes.
Or can the parasite obtain some shikimate pathway intermediates from other
sources?