Biochem. J. (2013) 449, 427–435 (Printed in Great Britain) 427 doi:10.1042/BJ20120980 Inferring the metabolism of human orphan metabolites from their metabolic network context affirms human gluconokinase activity Óttar ROLFSSON*†, Giuseppe PAGLIA*, Manuela MAGNUSDÓTTIR*, Bernhard Ø. PALSSON* and Ines THIELE*‡1 *Center for Systems Biology, University of Iceland, Sturlugata 8, 101 Reykjavik, Iceland, †University of Iceland Biomedical Center, Laeknagardur, 101 Reykjavik, Iceland, and ‡Faculty of Industrial Engineering, Mechanical Engineering and Computer Science, University of Iceland,101 Reykjavik, Iceland Metabolic network reconstructions define metabolic information within a target organism and can therefore be used to address incomplete metabolic information. In the present study we used a computational approach to identify human metabolites whose metabolism is incomplete on the basis of their detection in humans but exclusion from the human metabolic network reconstruction RECON 1. Candidate solutions, composed of metabolic reactions capable of explaining the metabolism of these compounds, were then identified computationally from a global biochemical reaction database. Solutions were characterized with respect to how metabolites were incorporated into RECON 1 and their biological relevance. Through detailed case studies we show that biologically plausible non-intuitive hypotheses regarding the metabolism of these compounds can be proposed in a semi-automated manner, in an approach that is similar to de novo network reconstruction. We subsequently experimentally validated one of the proposed hypotheses and report that C9orf103, previously identified as a candidate tumour suppressor gene, encodes a functional human gluconokinase. The results of the present study demonstrate how semi-automatic gap filling can be used to refine and extend metabolic reconstructions, thereby increasing their biological scope. Furthermore, we illustrate how incomplete human metabolic knowledge can be coupled with gene annotation in order to prioritize and confirm gene functions. INTRODUCTION knowledge gaps and, subsequently, suggesting candidate gapfilling reactions is referred to as network gap filling [10–12]. Because BiGG reconstructions can be converted into a mathematical format, gap filling is carried out computationally. The SMILEY gap-filling algorithm, for example, has been successfully used to unravel novel metabolic functions in Escherichia coli [13] and to propose novel metabolic functions in humans [8]. Similar metabolic pathway discovery algorithms have found widespread use in biotechnology for pathway discovery [14]. The global human genome-scale metabolic reconstruction RECON 1 represents one of the most comprehensive BiGG metabolic reconstructions generated to date [15]. RECON 1 has found diverse applications in human metabolic research (reviewed in [16]) and has the potential to facilitate novel metabolic reaction discovery through a gap-filling approach. Although successful examples exist for bacterial metabolism [9,17], the use of eukaryotic metabolic models to this end has received limited attention [8]. Although comprehensive, RECON 1, and indeed all genomescale reconstructions, is incomplete as knowledge of metabolism accumulates over time. Incorporation of acquired biological data can be performed using a gap-filling approach. If the incorporated biological components are incompletely annotated, then hypotheses of their biological functions can be generated from the biological network context. In the present study, we set out to identify and characterize human orphan metabolites that were defined here as metabolites that have been detected in human biofluids but were not accounted for in RECON 1. The present study aimed to elucidate the metabolism of these orphan metabolites in humans and simultaneously expand the biological The wealth of genomic and metabolic information that has amounted in the past decade has highlighted a need for methods that facilitate the functional annotation of biochemicals that take part in human metabolism [1]. Components of the human metabolome can be categorized on the basis of their degree of functional annotation into well-annotated, incompletely annotated and unknown components. The first category provides the basis for BiGG (biochemically, genetically and genomically) structured metabolic networks that have found widespread use in high-throughput data analysis, biochemical engineering and pathway discovery [2,3]. The latter two categories reflect that our knowledge of human metabolism is incomplete. Underlying these are (i) genes and proteins of unknown function, (ii) proteins that are not associated with a sequence (orphan enzymes) and (iii) metabolites, for which metabolic knowledge is incomplete. Estimates report that approximately 30–50 % of all enzymes associated with EC (Enzyme Commission) numbers are not associated with genes [4–6] and that the function of over 50 % of genes in higher organisms have not been experimentally determined [7]. Dead-end metabolites, i.e. compounds whose anabolic or catabolic route within humans are not completely defined, are also numerous [8]. BiGG metabolic network reconstructions contain components of an organism’s metabolome, from information on genes and metabolic enzymes to metabolic reactants and products, and on how these components interact [9]. Incomplete genetic, genomic and biochemical information can be identified as network gaps in BiGG reconstructions. The two-step procedure of identifying Key words: gap filling, human metabolism, metabolic network, RECON 1. Abbreviations used: BiGG, biochemically, genetically and genomically; EC, Enzyme Commission; GFP, green fluorescent protein; HMDB, Human Metabolome Database; IS, internal standard; LC-MS, liquid chromatography-MS; 6PGDH; 6-phosphogluconate dehydrogenase; PLP, pyrridoxal 5phosphate; PPP, pentose phosphate pathway. 1 To whom correspondence should be addressed (email [email protected]). c The Authors Journal compilation c 2013 Biochemical Society 428 Ó. Rolfsson and others accuracy of RECON 1. To our knowledge the systematic identification and characterization of human orphan metabolites has not been carried out previously. We used a gap-filling algorithm to compute reaction hypotheses that could account for these metabolites in humans and show that the approach not only helps to reconstruct known biochemical pathways, as expected, but is also capable of generating biologically plausible nonintuitive reaction hypotheses. We exemplified this by validating one of our predictions experimentally, which resulted in the characterization of a human gluconokinase. EXPERIMENTAL Pre-processing to metabolite integration RECON 1 was downloaded from the BiGG database [18]. The reconstruction was imported into Matlab (MathWorks) using the COBRA toolbox [19]. Subsequently, we decompartmentalized all reactions in RECON 1 by placing all intracellular compartment reactions into the cytosol and removing duplicates. Extraorganism-located reactions were kept. We downloaded the KEGG Ligand database (version as of 10/2009) [20] to obtain a universal reference reaction list. For each metabolite present in RECON 1 and in the universal database, we created a transport reaction from cytosol to the extra-organism space. Furthermore, we downloaded the HMDB (Human Metabolome Database, release 2.5) [21]. We mapped the entries from the HMDB on to RECON 1 using the BiGG identifiers provided by the HMDB. the basis of a literature review was made to select the most biologically relevant solution. These criteria generated a filtered candidate solution output (S1 and S2), which is reported in the Results section (Supplementary Table S2 at http://www.biochemj. org/bj/449/bj4490427add.htm). Candidate reactions comprising the S1 and S2 solutions were investigated individually to generate plausible hypotheses for the gap filling of RECON 1. The orphan metabolites’ biological origin was determined and the reactions composing the gap-filling solutions were assessed on the basis of the underlying literature and their annotations within multiple reaction and metabolite databases. Biochemical and metabolic information was obtained from the HMDB [21], the HSDB (Hazardous Substances Database; http://toxnet.nlm.nih.gov/cgi-bin/sis/htmlgen?HSDB) and the KEGG database [20]. For resolving reactions, not annotated in humans, BLAST homology searches of genes encoding the resolving reaction activity were performed against the human sequence (http://blast.ncbi.nlm.nih.gov/Blast.cgi) and using STRING [22]. On the basis of literature and electronic data, solutions were classified as: (i) accepted, solutions were composed of reactions for which genetic or biochemical data had been reported in humans or closely related species; (ii) hypothesis, solutions were biochemically plausible but genetic or biochemical information was incomplete and requires experimental investigation; (iii) discarded, solutions were composed of reactions for which no evidence could be found in humans or closely related species; and (iv) no solution, the gap-filling algorithm failed to generate a solution. All metabolic reaction and subcellular compartment abbreviations are as described in the BiGG database [18]. Computation of candidate solutions using gap filling The gap-filling algorithm has been described previously [9]. We used the implementation provided by the COBRA toolbox. Briefly, RECON 1 served as the model to which reactions from either the universal reaction list and/or the transport reaction list should be added. In order to identify candidate anabolic and catabolic solutions for each metabolite, we added a sink (sink_A: Ø→1 A) or demand reaction (DM_A: 1 A→Ø) to the model and performed the gap-filling analysis by requiring the flux through either DM_A or sink_A to be greater than zero. Up to ten solutions for each metabolite were computed. The gap-filling algorithm requires metabolites in the reconstruction to have KEGG identifiers in order to calculate metabolic reaction solutions. This restrained the metabolites identified to metabolites from the HMDB with KEGG identifiers. Note that not all metabolites in RECON 1 have a KEGG identifier, which may also limit the identification of candidate solutions from the KEGG database. Gap analysis and validation of the gap-filling solution output The gap-filling algorithm generated a total of 1131 solutions for the 343 orphan metabolites identified. In the majority of cases metabolites either had five solutions or only one (Supplementary Table S1 at http://www.biochemj.org/bj/449/bj4490427add.htm). Solutions were filtered with the following criteria in mind: (i) only solutions containing gene-associated resolving reactions were picked; (ii) if no solution met this criteria, solutions with the least number of resolving reactions and leading to integration of the metabolite within the network as opposed to transport were picked; and (iii) if no such solution was found, then the solution involving the least number of resolving reactions was picked, irrespective of the resolving reaction type. In cases where two solutions met these criteria an informed decision on c The Authors Journal compilation c 2013 Biochemical Society Overexpression of C9orf103 and enzyme assays C9orf103 isoform I (BI826543) and isoform II (BC142991) were directionally cloned into a pT7CFE1-CHis expression vector (Thermo Scientific) at the BamHI and XhoI restriction sites following amplification using the forward primers 5 -GGAATTCCATATGGGATCCGCGGCGCCGGGCGCGCTGCTG3 and 5 -GGAATTCCATATGGGATCCAAGAAACCGAGGCCCGCGGCTGC-3 for isoforms I and II respectively, and the reverse primer 5 -TATCTCGCTCGAGTTTCATTTTTAGGGTTTCCATAATTG-3 . This generated two vector constructs, pT7CFE1-GlncKIsoI and pT7CFE1-GlnKIsoII, which, following linearization with SpeI, were used in 25 μl coupled in vitro transcription/translation reactions using the Pierce® Human In Vitro Protein Expression kit for 6 h at 30 ◦ C. Confirmation of overexpressed products was performed by Western blotting using the HisProbe-HRP reagents and kit (Thermo Scientific). GFP (green fluorescent protein) expressed from pCFE-GFP (Thermo Scientific) was used as a positive control in in vitro translation reactions and the gluconokinase assay. Following translation, lysates were diluted in 1×kinase reaction buffer [100 mM phosphate, 40 mM NaCl and 2.5 mM MgCl2 (pH 7.2)] to a volume of 50 μl and desalted using spin columns with a 7.5 kDa molecular mass cut-off. A 12 μl volume of this diluted desalted lysate was then used in gluconokinase assays. The gluconokinase assay was carried out in 33 μl reaction volumes in kinase reaction buffer with 0.1 mM or 1 mM gluconate and 1 mM ATP. Purified recombinant E. coli gluconokinase (50 ng; Megazyme, specific activity 31.8 units/mg) was added and used as a positive control. Following mixing by pipetting, reaction mixtures were immediately split into 20 μl and 13 μl aliquots. The 20 μl aliquot was incubated for 2 h at room temperature (20 ◦ C), and then frozen at − 80 ◦ C for MS analysis. To the 13 μl aliquot, 2 μl Gap-filling metabolic networks post-reconstruction 429 of 6PGDH (6-phosphogluconate dehydrogenase) master mix was added and incubated at room temperature for 2 h. NADPH production was measured by monitoring fluorescence at 445 nm following excitation at 340 nm and interpolating values to that of an NADPH standard curve. The 6PGDH master mix resulted in the addition of 0.015 units of 6PGDH (Megazyme) and a 625 μM final concentration of NADP + . Following fluorescence measurements, a 3 μl aliquot was removed from the kinase reaction, diluted to 30 μl with kinase reaction buffer and the ATP content of the reaction was determined with a CellTiter-Glo assay (Promega) according to the manufacturer’s instructions. All data are results of averaging over three replicate samples and were similar in n = 3 repeats. LC-MS (liquid chromatography-MS) A UPLC system (UPLC Acquity, Waters) coupled in-line with a quadrupole-time-of-flight hybrid mass spectrometer (Synapt G2, Waters) was used. Separation was achieved by HILIC (hydrophilic interaction chromatography) analysis on an 1.7 μm Acquity amide column (2.1 mm×150 mm) (Waters). An electrospray ionization interface was used in negative mode to direct column eluent to the mass spectrometer. The flow rate was 0.4 ml/min. Mobile phase A was 10 mM acetonitrile/sodium bicarbonate (95:5) and mobile phase B contained 10 mM acetonitrile/sodium bicarbonate (5:95). The following elution gradient was used: 0 min at 99 % A, 3 min at 20 % A, 4 min at 99 % A and 5 min at 99 % A. A capillary voltage of 1.8 kV and a cone voltage of 35 V was used. MS spectra were acquired in centroid mode from m/z 50 to 1000 using a scan time of 0.35 s. Leucine enkephalin (2 ng/μl) was used as the lock mass (m/z 554.2615). [13 C6 ]Gluconate was used as the IS (internal standard) for enzyme assays when gluconate was used as the substrate, whereas gluconate was used as the IS for enzyme assays when [13 C6 ]gluconate was used as the substrate. A 5 μl volume of IS (0.6 mM) was added to 20 μl of sample. Successively 300 μl of methanol was added and samples were vortex-mixed for 5 min and centrifuged for 10 min at 10 000 g. Supernatants were recovered, dried under nitrogen and reconstituted in 80 μl of acetonitrile/water (50:50, v:v). Samples (5 μl) were injected into the UPLC-MS system. RESULTS Identification and characterization of human orphan metabolites We aimed to identify human orphan metabolites. Therefore we compared metabolites in RECON 1 with those that have been detected in human biofluids as reported within the HMDB database [21] (Figure 1A). A total of 343 metabolites were identified whose metabolic fate was not captured in RECON 1 but can be assumed to have a metabolic degradation or production route within humans on account of their detection in biofluids. The metabolites were characterized on the basis of their chemical taxonomy super-classes (Figure 2A). Organic acids, both aromatic and linear, followed by various alcohols and amino acid derivatives were found to be the most numerous. These compound classes complement the gaps caused by blocked reactions identified in RECON 1 reported previously, where the largest proportion of blocked reactions have been observed in amino acid metabolism, carbohydrate metabolism and lipid metabolism [8]. The majority of the metabolites were detected in either blood or urine and reflects that these biofluids are most commonly investigated due to ease of sample access (Figure 2B). Numerous metabolites were identified that would Figure 1 Identification of human orphan metabolites and of reactions capable of explaining their metabolic role (A) Orphan metabolites were defined and identified as metabolites that have been detected in human biofluids but not accounted for in RECON 1 (1). A modified network gap-filling algorithm was used to propose the manner in which metabolites could be incorporated into RECON 1 using non-organism-specific metabolic reactions from the KEGG database or transport reactions (2–3). Following validation of the biological relevance of proposed solutions (3), metabolites could be incorporated into an updated metabolic reconstruction (4). (B) A toy network showing the computed candidate solution types. Metabolite A is incorporated into the toy network with two solution routes (S1 and S2), which couple the production and consumption of A to metabolites already accounted for in the network (dark grey). A type II solution involves transport into the cell and incorporation into the network as shown for metabolite B. A type III solution involves transport into and out of the cell as shown for metabolite C. Note that solution routes S1 and/or S2 can be composed of multiple reactions. not normally be associated with basal metabolism, for example, drug derivatives and plant-derived antioxidants. Some of these metabolites, therefore, represent metabolic scope gaps [23], as they lie outside the biological scope of RECON 1 and/or are involved in non-metabolic pathways. Incorporation of orphan metabolites into RECON 1 In order to incorporate the identified metabolites into RECON 1, we computed candidate solutions, composed of non-organismspecific metabolic reactions from the KEGG database that could account for the metabolic production and consumption of each metabolite within RECON 1 (see the Experimental section). Metabolites that could not be incorporated using reactions from KEGG were connected to RECON 1 using an artificial transport function (Figure 1A). Up to ten solutions were generated for each metabolite (Supplementary Table S1) of which we chose two solutions routes (S1 and S2) corresponding to its metabolic production and consumption route. These were categorized on how they incorporated the metabolites into RECON 1 (Figure 1B) and whether they corresponded to biologically plausible gapfilling solutions (Figure 2C) (defined in the Experimental section). Owing to the reversibility of enzymatic reactions, we did not assign solution routes specifically as production or consumption c The Authors Journal compilation c 2013 Biochemical Society 430 Figure 2 Ó. Rolfsson and others Characterization of identified orphan metabolites and the properties of their computed solutions (A) The distribution of identified metabolites according to their chemical compound superclass. (B) The human biofluids in which the orphan metabolites were detected. (C) Computed candidate solutions for each metabolite were accepted, categorized as a valid hypotheses or rejected. Solutions were further characterized as type I, type II or type III solutions as outlined in Figure 1(B). (D) Computed candidate solutions with respect to their metabolite class. routes. A total of 119 metabolites had candidate solutions that were either accepted or corresponded to hypotheses. The remaining metabolites had solutions that were discarded (128) or had no solution at all (95). This filtered gap-filling output was the main focus of the following report (Supplementary Table S2). Accepted candidate solutions Analysis of the candidate solutions revealed that not all of the metabolites identified corresponded to orphan metabolites. In fact, solutions for 73 metabolites could be added directly to RECON 1 (Figure 2C). In all cases, the reactions associated with these metabolites, with the exception of added transport reactions, were either associated with known human genes or biochemically identified orphan enzyme activities in humans or closely related mammals. With these metabolites, the gap-filling algorithm expanded the metabolic reconstruction through addition of characterized human reactions and metabolites. For example, cholesterols and derivatives represented the most populous group of compounds with accepted solutions (Figure 2D). Five oestrogens were identified that could be added directly to RECON 1 using type I or II solutions, thereby expanding the C18 steroid biosynthesis pathway of RECON 1 (Supplementary Figure S1 at http://www.biochemj.org/bj/449/bj4490427add.htm). A total of 58 out of the 73 metabolites with accepted solutions had type I or II solutions. Having been identified in biofluids, c The Authors Journal compilation c 2013 Biochemical Society metabolites with type I solutions suggest that at least a portion of these can be produced and consumed within the cell. Metabolites with type III solutions do not add to a reconstruction knowledgebase in the same manner as type I and II solutions. Type III solutions introduce flux loops into the reconstruction. The plant metabolite phytosterol, for example, was connected to RECON 1 in three reactions: transport into the cell, conversion by -sterol reductase (EC 1.3.1.72) activity into isofucosterol and subsequent transport out of the cell. Contextualization of this type of knowledge is, however, important in terms of modelling sterol metabolism, where biological effects of these compounds could be coupled to knowledge of their metabolic interconversions. Any abnormalities of -sterol reductase activity could then be associated with biological effects of these compounds. Computed candidate solutions direct hypothesis generation As a next step we analysed how many biologically plausible hypotheses could be obtained using the gap-filling algorithm by a thorough literature review. We identified 47 orphan metabolites with plausible hypotheses. Most of these applied to organic acids, alcohols, nucleoside derivatives and amino acids (Figure 2D). A total of 22 of these metabolites could be connected to RECON 1 using type I solutions. One of these was cysteine S-sulfate, an NMDA (N-methyl-D-aspartate) receptor agonist excreted into Gap-filling metabolic networks post-reconstruction Figure 3 431 Orphan metabolites and examples of gap-filling candidate solutions Cysteine S -sulfate and D-gluconate were incorporated via type I solutions, 5α-androstan-3β,17β-diol via a type II solution and menthol with a type III solution. Solid arrows and metabolites in dark grey represent metabolic reactions and metabolites within RECON 1 before the present study. Dashed arrows represent reactions added by the gap-filling algorithm to incorporate the human orphan metabolite (white). EC numbers are indicated in bold. Metabolites in light grey were not in RECON 1 and were not detected as orphan metabolites, but were added in order to incorporate the orphan metabolite. EC numbers in italic and dotted arrows represent functions identified following a review of the literature. the urine of individuals with sulfite oxidase deficiency [24]. Cysteine S-sulfate was connected to RECON 1 through the L-cystine and L-cysteine by the addition of three enzymatic reactions and the metabolite O-acetyl-L-serine (Figure 3). The formation of cysteine S-sulfate from L-cystine was deemed plausible as glutathione-CoA-glutathione transhydrogenase (EC 1.8.4.3) is an orphan reaction in rat and bovine [25] and has also been shown to be spontaneous and reversible [26]. Conversion of cysteine S-sulfate into L-cysteine involved cysteine synthase (EC 2.5.1.47), cystathionine δ-synthase (EC 2.5.1.48) and O-acetylhomoserine aminocarboxypropyl-transferase (EC 2.5.1.49). These three enzymes have sequence similarity to two PLP (pyrridoxal 5-phosphate)-dependent enzymes of the human trans-sulfuration pathway, cystathione β-synthase (EC 4.2.1.22) and cystathionine δ-lyase (EC 4.4.1.1), which have recently been shown to have broad substrate specificity in the β-replacement reactions and in the elimination reactions that these enzymes catalyse respectively [27,28]. Cystathionine β-synthase is, for example, known to catalyse the α,β-elimination reaction of L-cysteine to produce L-serine [27,28]. Whether these two enzymes are capable of eliminating thiosulfide from cysteine S-sulfate, to generate serine and thiosulfate, has not been experimentally determined. Type I solutions were also obtained for compounds originating from coffee or tea, such as theophyllin, theobromine and caffeine. The metabolism of these compounds is fairly well characterized, although the enzymes associated with their degradation have not been fully characterized and the resulting excreted metabolites are heterogeneous. Accordingly, the solution output consisted of multiple degradation routes, none of which could be discriminated against and highlights a shortcoming of automated hypothesis generation when multiple solutions are correct or can be obtained (Supplementary Figure S2 at http://www. biochemj.org/bj/449/bj4490427add.htm). Type II solutions were obtained for 18 out of the 47 metabolites. Among these was 5α-androstane-3β,17β-diol, a steroid that has been shown to exert an inhibitory effect on prostate cancer cell migration via activation of oestrogen receptor β [29]. The candidate solution involved production from dihydrotestosterone with 3β (or 20α)-hydroxysteroid dehydrogenase (EC 1.1.1.210). Its activity has been detected in ox and rat but has not been associated with a gene in mammals [30]. A recombinant 3βhydroxy-5-steroid dehydrogenase (EC 1.1.1.145) was, however, previously shown to carry out this reaction (Figure 3) [31]. A total of seven metabolites had type III solutions. For example, the metabolic route of menthol was predicted to involve intracellular transport, oxidation of menthol to generate p-menthane-3,8-diol followed by extracellular transport of p-menthane-3,8-diol (Figure 3). Menthol is known to enter either enterohepatic circulation as a biliary metabolite or be oxidized by cytochrome P450 generating menthol diols that are then excreted into the urine [32,33]. Type III solutions were also obtained for valproic acid, an epileptic drug. These two examples highlight the need for a detailed reconstruction of xenobiotics, which are currently not captured in RECON 1. Type III solutions were also obtained for methylated nucleosides, well-known cancer markers, which are also not represented in RECON 1 as they are derived from the degradation of RNA and therefore are beyond its scope. Experimental validation of a computed hypothesis Candidate reaction solutions remain hypotheses until experimentally proven. We performed experimental confirmations of the type II solution obtained for gluconate. Gluconate is the C-1-oxidized linear derivative of glucose and a common food and drug constituent [34]. The proposed metabolism involved the oxidation of glucose followed by hydrolysis of Dglucono-1,5-lactone to generate D-gluconate that is subsequently metabolized upon phosphorylation in the PPP (pentose phosphate pathway) (Figure 3). Hexose-1-dehydrogenase (EC 1.1.1.47) and gluconolactonase (EC 3.1.1.17) activity have previously been c The Authors Journal compilation c 2013 Biochemical Society 432 Ó. Rolfsson and others Figure 4 End point enzyme assays of overexpressed C9orf103 in HeLa lysates confirms gluconokinase activity of isoform I of C9orf103 A 6PGDH-coupled enzyme assay of gluconokinase activity in HeLa lysates overexpressing isoform I or isoform II of C9orf103 or GFP. More NADPH is generated in the presence of isoform I as compared with isoform II and the negative control GFP (A). ATP end point measurements in the gluconokinase reactions show a drop in ATP concentration consistent with ATP-dependent kinase activity for isoform I (B). A similar effect on NADPH and ATP concentrations was observed in the presence of the positive control E. coli gluconate kinase (GlcnK_Ecoli). All reaction mixtures contained ATP at a final concentration of 1 mM and were diluted 10× before performing the ATP assay. reported in humans [35,36]. An enzyme with gluconokinase activity has also been previously purified and characterized from pig kidney extracts [37]. Performing a homology search of bacterial enzymes having gluconokinase activity (EC 2.7.1.12), we identified C9orf103 as a plausible candidate gene, which may be capable of providing a catabolic route for gluconate in humans. A BLAST query of isoform I of C9orf103 against the E. coli genome revealed a 40 % sequence identity to gluconokinase (gene symbol GNTK) with an E value of 8 − 40 and over 90 % sequence coverage. Furthermore, 12 out of the 14 amino acid residues that have been shown to be important for E. coli gluconokinase substrate binding are conserved in C9orf103 (Supplementary Figure S3 at http://www.biochemj. org/bj/449/bj4490427add.htm). In order to confirm gluconokinase activity of C9orf103, we overexpressed isoforms I and II of C9orf103 in human HeLa cell lysates (Supplementary Figure S4 at http://www. biochemj.org/bj/449/bj4490427add.htm) and assayed for gluconokinase activity using a 6PGDH-coupled enzyme assay. Following incubation of cell lysates with added gluconate, we detected more NADPH in lysates containing overexpressed isoform I than with isoform II or the negative control containing GFP. The NADPH concentration was similar to that observed for the E. coli gluconokinase positive control. Little or no NADPH was detected in the water negative control (Figure 4A). These data suggest that C9orf103 encodes an active gluconokinase. The difference in NADPH concentration can be largely attributed to the reduction of NADP + that accompanies the oxidation of 6phosphogluconate produced by the C9orf103 gene product. A c The Authors Journal compilation c 2013 Biochemical Society drop in ATP concentration upon gluconate addition was also observed in the lysate expressing isoform I, consistent with ATPdependent kinase activity (Figure 4B). A change in both NADPH and ATP concentration was also observed upon gluconate addition in isoform II and GFP samples, perhaps hinting at background gluconokinase activity in these lysates. To further confirm the gluconokinase activity of isoform I and completely rule out activity of isoform II or background activity, we quantified gluconate and 6-phosphogluconate in the cell lysates using LC-MS (Figure 5 and Supplementary Figure S5 at http://www.biochemj.org/bj/449/bj4490427add.htm). Component analysis showed little or no gluconate in isoform I and in the positive-control samples, whereas 6-phosphogluconate was readily detected. The opposite was true for isoform II and the negative-control samples. An experiment where uniformly labelled [13 C]gluconate was used as the substrate unequivocally demonstrates that the added gluconate is indeed converted into 6-phosphogluconate in the presence of overexpressed isoform I. No kinase activity was observed for isoform II above the negative control. This is most likely due to the absence of the phosphatebinding loop binding domain that is present in the N-terminal domain of isoform I but missing in isoform II (Supplementary Figure S3), suggesting a different function for this isoform. These data confirm gluconokinase activity in humans and support a gluconate degradation route through the PPP similar to the one observed in lower organisms [38]. Metabolites with discarded or no solutions A large proportion of the metabolites identified had either candidate solutions that were discarded or had no solutions at all (Figure 2C). Approximately 60 % of the metabolites having discarded solutions were organic acids, alcohols or amino acid derivatives, which also represented the most numerous metabolite classes. In the majority of the cases, no evidence was found to support the occurrence of the proposed solutions in humans. Some metabolites were also found to not correspond to orphan metabolites. Exogenous compounds, such as rosmarinic acid and curcumin, have been shown to be excreted following glucuronidation or glycosylation [39,40]. Similarly, compounds, such as secondary bile acids, tartarate, raffinose, melbiose and 6-hydroxynicotinic acid, which are known to be indirectly metabolized in humans by the gut microbiome, do not need to be accounted for in RECON 1. Compounds that had no solutions were mostly inorganic compounds, amino acid derivatives, organic acids (Figure 2D) and also compounds which are not associated with basal metabolism, such as arsenic, lead, ibuprofen and naproxen. DISCUSSION In the present study, RECON 1 was used as a discovery tool alongside a library of chemical reactions to computationally propose metabolic routes for human orphan metabolites found in biofluid samples. The key results include (i) the identification and characterization of 343 metabolites, (ii) metabolic hypotheses for multiple orphan metabolites and the biochemically detailed description of five of these, (iii) the experimental validation of one of the computationally proposed hypotheses leading to the characterization of C9orf103 as a human gluconokinase, and (iv) a large proportion of the identified metabolites could not be accounted for using our approach and may focus future gapfilling efforts. The results of the present study demonstrate how gap filling can focus the expansion of metabolic networks and Gap-filling metabolic networks post-reconstruction Figure 5 433 LC-MS analysis of gluconate and 6-phosphogluconate composition in HeLa lysates expressing C9orf103 Overlapped extracted ion chromatograms of gluconate (Glcn) and 6-phosphogluconate (6P-Glcn) from gluconokinase assays containing overexpressed C9orf103 isoform I (A and C) or isoform II (B) in the presence of 1 mM Glcn (A and B) or 1 mM [13 C6 ]gluconate (C). Quantification of the reaction components using ISs showed that gluconate was consumed and 6-phosphogluconate was produced only in the presence of overexpressed isoform I and of the positive control. Similarly, when [13 C6 ]gluconate was used as the substrate, [13 C6 ]6-phosphogluconate was only detected in lysates overexpressing isoform I and the positive control (D and E). facilitate metabolite and gene annotation through a metabolitecentred approach. Our approach assumed that RECON 1 correctly captured current knowledge and, therefore, that all metabolites identified were orphan metabolites. Clearly this was not the case as more than one-fifth of the metabolites could be added directly to RECON 1, thus not being true orphan metabolites. This result demonstrates that the metabolic network reconstruction process is iterative requiring periodical refinement so that the network model sustains its biological accuracy. With respect to the accepted solutions, the gap-filling algorithm therefore fulfils its role as a network gap-filling tool [9]. We generated biologically plausible hypotheses for almost 50 orphan metabolites. The notion that almost half of these have proposed anabolic and catabolic routes infers that their detection in biofluids may be associated with abnormal metabolic states, as is the case for cysteine S-sulfate (OMIMM# 272300). The computed hypotheses (Figure 3) conforms with the previously proposed metabolism of cysteine S-sulfate from L-cysteine in the pathogenesis of sulfite oxidase deficiency [38] albeit in biochemical detail. The second computed solution, through L-serine, is also plausible on the account of the catalytic promiscuity of the PLP-dependent enzymes involved towards sulfur-containing amino acids [27,28]. PLP-dependent enzymes have broad catalytic activity on account of the diverse reactivity of the PLP Schiff base intermediate [41]. For modelling of disease-associated metabolic phenotypes, such as inborn errors of metabolism [42], these types of discoveries are valuable. The remainder of the hypotheses involved type II and III solutions. Interestingly, during the preparation of the present paper the computed candidate reaction for menthol was confirmed to be catalysed by CYP2A6 in human liver microsomes by Nakanishi and co-workers [43]. Multiple approaches have been devised to derive gene or protein function [44]. With the recent surge in non- targeted metabolite profiling, the number of metabolites whose biological roles are not defined will surely increase [45]. The three solution types highlight how network gap filling directs metabolic hypotheses of orphan metabolites through contextualization of biological data. The linkage between metabolites and genes can therefore be successfully done in a semi-automated manner using a high-quality genome-scale metabolic reconstruction. Our network gap-filling approach suffers a drawback in that it is not fully automated. Since the universal reaction database (KEGG) used captures biochemical reactions identified in all domains of life, manual ranking and curation of candidate solutions was required before addition of computed reaction solutions to the network, similar to the laborious bottomup network reconstruction procedure [46]. The prioritization of predicted pathways between metabolites has been a focus point in synthetic pathway engineering, where candidate reactions between metabolites have been ranked according thermodynamic feasibility, pathway length, chemical similarity and organism specificity [14]. Automated ranking and filtering of the gap-filling output would be valuable when faced with multiple candidate solutions. We confirmed the computationally proposed hypotheses for gluconate consumption and showed that C9orf103 encodes a functional gluconokinase. The proposed metabolism of gluconate is identical with what is observed in lower organisms where it has been extensively characterized [38]. The molecular details of gluconate metabolism in mammals are less well understood, although early biochemical experiments suggest a similar degradation route. Using 14 C-labelled gluconate, Stetten and Topper [47] demonstrated that the carboxyl carbon of gluconate yields CO2 abundantly in intact rats. Because reduction of gluconate directly to glucose had been shown to not take place [47], the finding was consistent with omission of CO2 from 6-phosphogluconate by 6PDGH (EC 1.1.1.44) [47,49]. c The Authors Journal compilation c 2013 Biochemical Society 434 Ó. Rolfsson and others C-6 of gluconate was, however, observed to contribute to fatty acid synthesis, presumably following integration into glycolysis through the non-oxidative branch of the PPP [50]. These studies of gluconate metabolism took part in pioneering investigations into oxidative glucose metabolism that aided in the discovery of the PPP [51]. An active gluconokinase was later characterized and purified from rat and pig kidney [37]. We have demonstrated for the first time that gluconokinase activity is encoded within the human genome. C9orf103 was initially identified during cDNA library construction [52] and later shown to be located within the commonly deleted region of del(9q) acute myeloid leukaemia [53]. These studies did not report on the possible function of the gene. C9orf103 was, however, electronically annotated prior to the present study as a putative gluconokinase. A similarity in protein sequence does not always transfer to enzyme function [54]. Isoform II was, for example, shown in the present study to be catalytically inactive. Also, an 18 amino acid insertion within both isoforms of C9orf103 as compared with E. coli GlcnK (Supplementary Figure S3) does not appear to abolish the gluconokinase activity of isoform I. The location of this peptide insertion within the amino acid sequence of isoform I hints at a structural difference between isoform I and E. coli GntK previously observed between E.coli GlcnK and various NMP kinases that are known to catalyse the phosphorylation of a broad range of substrates [55]. Although this might suggest additional functions for the C9orf103 protein product, the gluconokinase purified from pig kidney was found to exhibit strict specificity for D-gluconate [37]. Ultimately, the result of the present study provides a missing genetic link between mammalian gluconokinase activity and our computationally proposed gluconate catabolic route previously proposed through biochemical pulse–chase experiments in rats. An anabolic route for gluconate through the oxidation of glucose was also computed. Although both activities for this conversion are found in humans, cell biology experiments will be required to confirm whether these act synergistically to generate gluconate in vivo. There is no apparent gain of oxidation of glucose through gluconate compared with glucose 6-phosphate. Both presumably yield the same pentose phosphates and two molecules of NAPDH. This result does, however, not rule out a metabolic contribution of gluconate as a nutrient with a catabolic role via the PPP following, for example, cellular uptake or as a degradation product of intracellular carbohydrates. Similar experiments would also shed light on the effect of down-regulation of C9orf103 associated with del(9q) acute myeloid leukaemia [53]. We identified more than 200 orphan metabolites for which no solution could be obtained as their metabolism was beyond the scope of RECON 1. These metabolites point to reaction pathways that should be incorporated into RECON 1 to account for metabolic phenotypes associated with, for example, phase I and II xenobiotic detoxification routes, the contribution of the human microbiome, signalling cascades, and transcription and translation. Integration of such non-metabolic reconstructions would help address orphan metabolites, whose metabolic fate lies beyond the scope of RECON 1 and subsequently better define a list of true human orphan metabolites. The finding that RECON 1 can be used to generate computational hypotheses of human biological functions is intriguing as it implies that metabolic models can be used to prioritize the discovery of human metabolic functions albeit by focusing on metabolites rather than genes. This prioritization stems from filling gaps in knowledge of human metabolism, thereby systematically expanding metabolic knowledge through a bottom-up approach. This approach does, however, not unearth functions that are globally uncharacterized as the method relies on c The Authors Journal compilation c 2013 Biochemical Society annotated functions and genes. The manner in which incomplete biological data is put into context with current metabolic knowledge is therefore the driving force behind this approach of biological target identification and subsequent functional validation. The result is an expanded metabolic reconstruction, which better captures and defines the known and unknowns of human metabolism. AUTHOR CONTRIBUTION Óttar Rolfsson designed the study, performed the experiments, analysed the data and wrote the paper. Giuseppe Paglia and Manuela Magnusdóttir performed MS analysis. Bernhard Palsson contributed to experimental design and writing of the paper. Ines Thiele designed the study, performed the experiments and contributed to the writing of the paper. FUNDING This work was supported by the European Research Council [grant number 232816]. REFERENCES 1 Furnham, N., Garavelli, J. S., Apweiler, R. and Thornton, J. M. (2009) Missing in action: enzyme functional annotations in biological databases. Nat. Chem. Biol. 5, 521–525 2 Reed, J. L., Famili, I., Thiele, I. and Palsson, B. O. (2006) Towards multidimensional genome annotation. Nat. Rev. Genet. 7, 130–141 3 Oberhardt, M. A., Palsson, B. O. and Papin, J. A. (2009) Applications of genome-scale metabolic reconstructions. Mol. Syst. Biol. 5, 320 4 Karp, P. D. (2004) Call for an enzyme genomics initiative. Genome Biol. 5, 401 5 Lespinet, O. and Labedan, B. (2005) Orphan enzymes? Science 307, 42 6 Chen, L. and Vitkup, D. (2007) Distribution of orphan metabolic activities. Trends Biotechnol. 25, 343–348 7 Hanson, A. D., Pribat, A., Waller, J. C. and de Crécy-Lagard, V. (2010) ‘Unknown’ proteins and ‘orphan’ enzymes: the missing half of the engineering parts list: and how to find it. Biochem. J. 425, 1–11 8 Rolfsson, O., Palsson, B. Ø. and Thiele, I. (2011) The human metabolic reconstruction Recon 1 directs hypotheses of novel human metabolic functions. BMC Syst. Biol. 5, 155 9 Reed, J. L., Patel, T. R., Chen, K. H., Joyce, A. R., Applebee, M. K., Herring, C. D., Bui, O. T., Knight, E. M., Fong, S. S. and Palsson, B. O. (2006) Systems approach to refining genome annotation. Proc. Natl. Acad. Sci., U.S.A. 103, 17480–17484 10 Kharchenko, P., Vitkup, D. and Church, G. M. (2004) Filling gaps in a metabolic network using expression information. Bioinformatics 20 (Suppl. 1), 178–185 11 Kumar, V. S. and Maranas, C. D. (2009) GrowMatch: an automated method for reconciling in silico /in vivo growth predictions. PLoS Comput. Biol. 5, e1000308 12 Yamada, T., Waller, A. S., Raes, J., Zelezniak, A., Perchat, N., Perret, A., Salanoubat, M., Patil, K. R., Weissenbach, J. and Bork, P. (2012) Prediction and identification of sequences coding for orphan enzymes using genomic and metagenomic neighbours. Mol. Syst. Biol. 8, 851 13 Orth, J. D. and Palsson, B. Ø. (2012) Gap-filling analysis of the iJO1366 Escherichia coli metabolic network reconstruction for discovery of metabolic functions. BMC Syst. Biol. 6, 30 14 Medema, M. H., Raaphorst, R. V., Takano, E. and Breitling, R. (2012) Computational tools for the synthetic design of biochemical pathways. Nat. Rev. Microbiol. 10, 191–202 15 Duarte, N. C., Becker, S. A., Jamshidi, N., Thiele, I., Mo, M. L., Vo, T. D., Srivas, R. and Palsson, B. Ø. (2007) Global reconstruction of the human metabolic network based on genomic and bibliomic data. Proc. Natl. Acad. Sci. U.S.A. 104, 1777–1782 16 Bordbar, A. and Palsson, B. O. (2012) Using the reconstructed genome-scale human metabolic network to study physiology and pathology. J. Intern. Med. 271, 131–141 17 Nakahigashi, K., Toya, Y., Ishii, N., Soga, T., Hasegawa, M., Watanabe, H., Takai, Y., Honma, M., Mori, H. and Tomita, M. (2009) Systematic phenome analysis of Escherichia coli multiple-knockout mutants reveals hidden reactions in central carbon metabolism. Mol. Syst. Biol. 5, 306 18 Schellenberger, J., Park, J., Conrad, T. and Palsson, B. (2010) BiGG: a Biochemical Genetic and Genomic knowledgebase of large scale metabolic reconstructions. BMC Bioinf. 11, 213–213 19 Schellenberger, J., Que, R., Fleming, R. M. T., Thiele, I., Orth, J. D., Feist, A. M., Zielinski, D. C., Bordbar, A., Lewis, N. E., Rahmanian, S. et al. (2011) Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox v2.0. Nat. Protoc. 6, 1290–1307 Gap-filling metabolic networks post-reconstruction 20 Goto, S., Okuno, Y., Hattori, M., Nishioka, T. and Kanehisa, M. (2002) LIGAND: database of chemical compounds and reactions in biological pathways. Nucleic Acids Res. 30, 402–404 21 Wishart, D. S., Tzur, D., Knox, C., Eisner, R., Guo, A. C., Young, N., Cheng, D., Jewell, K., Arndt, D., Sawhney, S. et al. (2007) HMDB: the Human Metabolome Database. Nucleic Acids Res. 35, D521–D526 22 Jensen, L. J., Kuhn, M., Stark, M., Chaffron, S., Creevey, C., Muller, J., Doerks, T., Julien, P., Roth, A., Simonovic, M. et al. (2009) STRING 8: a global view on proteins and their functional interactions in 630 organisms. Nucleic Acids Res. 37, D412–D416 23 Orth, J. D. and Palsson, B. Ø. (2010) Systematizing the generation of missing metabolic knowledge. Biotechnol. Bioeng. 107, 403–412 24 Tan, W.-H., Eichler, F. S., Hoda, S., Lee, M. S., Baris, H., Hanley, C. A., Grant, P. E., Krishnamoorthy, K. S. and Shih, V. E. (2005) Isolated sulfite oxidase deficiency: a case report with a novel mutation and review of the literature. Pediatrics 116, 757–766 25 Chang, S. H. and Wilken, D. R. (1966) Participation of the unsymmetrical disulfide of coenzyme a and glutathione in an enzymatic sulfhydryl-disulfide interchange. J. Biol. Chem. 241, 4251–4260 26 Stricks, W. and Kolthoff, I. M. (1951) Equilibrium constants of the reactions of sulfite with cystine and with dithiodiglycolic acid. J. Am. Chem. Soc. 73, 4569–4574 27 Chiku, T., Padovani, D., Zhu, W., Singh, S., Vitvitsky, V. and Banerjee, R. (2009) H2S biogenesis by human cystathionine γ -lyase leads to the novel sulfur metabolites lanthionine and homolanthionine and is responsive to the grade of hyperhomocysteinemia. J. Biol. Chem. 284, 11601–11612 28 Singh, S., Padovani, D., Leslie, R. A., Chiku, T. and Banerjee, R. (2009) Relative contributions of cystathionine β-synthase and γ -cystathionase to H2S biogenesis via alternative trans-sulfuration reactions. J. Biol. Chem. 284, 22457–22466 29 Guerini, V., Sau, D., Scaccianoce, E., Rusmini, P., Ciana, P., Maggi, A., Martini, P. G. V., Katzenellenbogen, B. S., Martini, L., Motta, M. and Poletti, A. (2005) The androgen derivative 5α-androstane-3β,17β-diol inhibits prostate cancer cell migration through activation of the estrogen receptor β subtype. Cancer Res. 65, 5445–5453 30 Sharaf, M. A. and Sweet, F. (1982) Dual activity at an enzyme active site: 3β,20α-hydroxysteroid oxidoreductase from fetal blood. Biochemistry 21, 4615–4620 31 Matsunaga, T., Endo, S., Maeda, S., Ishikura, S., Tajima, K., Tanaka, N., Nakamura, K. T., Imamura, Y. and Hara, A. (2008) Characterization of human DHRS4: an inducible short-chain dehydrogenase/reductase enzyme with 3β-hydroxysteroid dehydrogenase activity. Arch. Biochem. Biophys. 477, 339–347 32 Austin, C. A., Shephard, E. A., Pike, S. F., Rabin, B. R. and Phillips, I. R. (1988) The effect of terpenoid compounds on cytochrome P-450 levels in rat liver. Biochem. Pharmacol. 37, 2223–2229 33 Gelal, A., Jacob, 3rd, P., Yu, L. and Benowitz, N. L. (1999) Disposition kinetics and effects of menthol. Clin. Pharmacol. Ther. 66, 128–135 34 Singh, O. V. and Kumar, R. (2007) Biotechnological production of gluconic acid: future implications. Appl. Microbiol. Biotechnol. 75, 713–722 35 Senesi, S., Csala, M., Marcolongo, P., Fulceri, R., Mandl, J., Banhegyi, G. and Benedetti, A. (2010) Hexose-6-phosphate dehydrogenase in the endoplasmic reticulum. Biol. Chem. 391, 1–8 435 36 Kondo, Y., Inai, Y., Sato, Y., Handa, S., Kubo, S., Shimokado, K., Goto, S., Nishikimi, M., Maruyama, N. and Ishigami, A. (2006) Senescence marker protein 30 functions as gluconolactonase in l-ascorbic acid biosynthesis, and its knockout mice are prone to scurvy. Proc. Natl. Acad. Sci., U.S.A. 103, 5723–5728 37 Leder, I. G. (1957) Hog kidney gluconokinase. J. Biol. Chem. 225, 125–136 38 Peekhaus, N. and Conway, T. (1998) What’s for dinner?: Entner-Doudoroff metabolism in Escherichia coli . J. Bacteriol. 180, 3495–3502 39 Holder, G. M., Plummer, J. L. and Ryan, A. J. (1978) The metabolism and excretion of curcumin (1,7-Bis-(4-hydroxy-3-methoxyphenyl)-1,6-heptadiene-3,5-dione) in the rat. Xenobiotica 8, 761–768 40 Baba, S., Osakabe, N., Natsume, M., Yasuda, A., Muto, Y., Hiyoshi, K., Takano, H., Yoshikawa, T. and Terao, J. (2005) Absorption, metabolism, degradation and urinary excretion of rosmarinic acid after intake of Perilla frutescens extract in humans. Eur. J. Nutr. 44, 1–9 41 Toney, M. D. (2011) Controlling reaction specificity in pyridoxal phosphate enzymes. Biocheim. Biophys. Acta 1814, 1407–1418 42 Sahoo, S., Franzson, L., Jonsson, J. J. and Thiele, I. (2012) A compendium of inborn errors of metabolism mapped onto the human metabolic network. Mol. Biosyst. 10, 2545–2558 43 Miyazawa, M., Marumoto, S., Takahashi, T., Nakahashi, H., Haigou, R. and Nakanishi, K. (2011) Metabolism of ( + )- and ( − )-menthols by CYP2A6 in human liver microsomes. J. Oleo Sci. 60, 127–132 44 Eisenthal, R. and Danson, M. J. (2002), Enzyme Assays: A Practical Approach. Oxford University Press 45 Patti, G. J., Yanes, O. and Siuzdak, G. (2012) Metabolomics: the apogee of the omics trilogy. Nat. Rev. Mol. Cell Biol. 13, 263–269 46 Thiele, I. and Palsson, B. Ø. (2010) A protocol for generating a high-quality genome-scale metabolic reconstruction. Nat. Protoc. 5, 93–121 47 Stetten, M. R. and Topper, Y. J. (1953) Pathways from gluconic acid to glucose in vivo . J. Biol Chem. 203, 653–664 48 Reference deleted 49 Stetten, M. R. and Stetten, Jr, D. (1950) The metabolism of gluconic acid. J. Biol. Chem. 187, 241–252 50 Bloom, B. and Stetten, Jr, D. (1955) The fraction of glucose catabolized via the glycolytic pathway. J. Biol. Chem. 212, 555–563 51 Horecker, B. L. (2002) The pentose phosphate pathway. J. Biol. Chem. 277, 47965–47971 52 Strausberg, R. L., Feingold, E. A., Grouse, L. H., Derge, J. G., Klausner, R. D., Collins, F. S., Wagner, L., Shenmen, C. M., Schuler, G. D., Altschul, S. F. et al. (2002) Generation and initial analysis of more than 15000 full-length human and mouse cDNA sequences. Proc. Natl. Acad. Sci. U.S.A. 99, 16899–16903 53 Sweetser, D. A., Peniket, A. J., Haaland, C., Blomberg, A. A., Zhang, Y., Zaidi, S. T., Dayyani, F., Zhao, Z., Heerema, N. A., Boultwood, J. et al. (2005) Delineation of the minimal commonly deleted segment and identification of candidate tumor-suppressor genes in del(9q) acute myeloid leukemia. Genes Chromosomes Cancer 44, 279–291 54 Rost, B. (2002) Enzyme function less conserved than anticipated. J. Mol. Biol. 318, 595–608 55 Kraft, L., Sprenger, G. A. and Lindqvist, Y. (2002) Conformational changes during the catalytic cycle of gluconate kinase as revealed by X-ray crystallography. J. Mol. Biol. 318, 1057–1069 Received 18 June 2012/13 September 2012; accepted 15 October 2012 Published as BJ Immediate Publication 15 October 2012, doi:10.1042/BJ20120980 c The Authors Journal compilation c 2013 Biochemical Society Biochem. J. (2013) 449, 427–435 (Printed in Great Britain) doi:10.1042/BJ20120980 SUPPLEMENTARY ONLINE DATA Inferring the metabolism of human orphan metabolites from their metabolic network context affirms human gluconokinase activity Óttar ROLFSSON*†, Giuseppe PAGLIA*, Manuela MAGNUSDÓTTIR*, Bernhard Ø. PALSSON* and Ines THIELE*‡1 *Center for Systems Biology, University of Iceland, Sturlugata 8, 101 Reykjavik, Iceland, †University of Iceland Biomedical Center, Laeknagardur, 101 Reykjavik, Iceland, and ‡Faculty of Industrial Engineering, Mechanical Engineering and Computer Science, University of Iceland,101 Reykjavik, Iceland Figure S1 algorithm Four metabolites derived from oestrone that have been identified in human biofluids and that were incorporated into RECON 1 using the gap-filling Selected reactions and metabolites in C18 steroid metabolism, as they appear in a decompartmentalized model of RECON 1, are shown in blue. The gap-filling solutions for 2-methoxy-estradiol-17β, 2-methoxyestrone, 2-hydroxyoestrone and 16-α-hydroxyoestrone are shown with green arrows. 16-α-Hydroxyoestrone has a type I solution, whereas the remainder have type II solutions. Note that the type II solution was also a gap-filling solution in RECON 1 in that it connected oestriol to the network, which would otherwise be caught in a loop and did not carry flux. 1 To whom correspondence should be addressed (email [email protected]). c The Authors Journal compilation c 2013 Biochemical Society O. Rolfsson and others Figure S2 Reconstruction of caffeine and theobromine metabolism Manual curation is required in order to generate biologically plausible metabolic pathways for highly interconnected metabolites. An example of where automated gap filling and hypotheses generation fell short. The primary metabolic degradation routes of caffeine, theobromine and its derivatives were through xanthine, which was then excreted as uric acid, and through the excretion of 5-acetylamino-6-amino-3-methyluric acid. The gap-filling algorithm is capable of proposing multiple degradation routes, the manual curation of which are required in order to accurately reflect caffeine metabolism. c The Authors Journal compilation c 2013 Biochemical Society Gap-filling metabolic networks post-reconstruction Figure S3 Sequence alignment of isoform I and II of C9orf103 to the amino acid sequences of the gluconokinases of E. coli Gluconokinase is encoded in E. coli by two genes, which encode a thermoresistant (GNTK ) and thermosensitive (IdnK ) gluconokinase. C9orf103 has 40 % and 39 % sequence identity to these genes respectively (BLAST data not shown). Residues that have been shown to be important for gluconic acid binding in E. coli GNTK [1] are highlighted in red. The phosphate-binding domain P-loop is highlighted in green and is absent from C9orf103 isoform II. Colouring is according to percentage identity. The alignment and Figure were generated with Jalview. c The Authors Journal compilation c 2013 Biochemical Society O. Rolfsson and others Figure S4 Western blots of coupled in vitro transcription/ translation reactions of C9orf103 isoform I (pT7CFE1-GlncKIsoI) (A) and C9orf103 isoform II (pT7CFE1-GlncKIsoII) (B) using 0.5–2.0 μg of linearized plasmid template (lanes 1–8) GFP template DNA (0.5 μg) was used in positive-control reactions. Water was used as a negative control. In total 1 μl and 3 μl from each 25 μl reaction mixture were loaded and run on 10 % Bis/Tris NuPAGE gels for 50 min. Using a supersignal West HisProbe Kit, both isoforms could be detected irrespective of DNA template amount used following transfer on to a nitrocellulose membrane. These are highlighted with red arrows. A BenchMark His-tagged Protein Standard (Invitrogen) was used to assess molecular mass using Image Lab software (Bio-Rad Laboratories). C9orf103 isoform I + was observed to have a molecular mass of 22.4 + − 0.17 kDa (calculated molecular mass 22.328 kDa). C9orf103 isoform II was observed to have a molecular mass of 29.1 − 0.22 kDa (calculated molecular mass 27.047 kDa). c The Authors Journal compilation c 2013 Biochemical Society Gap-filling metabolic networks post-reconstruction Figure S5 MS/MS (tandem MS) spectra of gluconate and 6-phosphogluconate (A) MS/MS spectrum of 6-phosphogluconate during ISO I assay. (B) MS/MS spectrum of [13 C6 ]6-phosphogluconate during ISO I* assay using [13 C6 ]gluconate. REFERENCE 1 Kraft, L., Sprenger, G. A. and Lindqvist, Y. (2002) Conformational changes during the catalytic cycle of gluconate kinase as revealed by X-ray crystallography. J. Mol. Biol. 318, 1057–1069 Received 18 June 2012/13 September 2012; accepted 15 October 2012 Published as BJ Immediate Publication 15 October 2012, doi:10.1042/BJ20120980 c The Authors Journal compilation c 2013 Biochemical Society
© Copyright 2026 Paperzz