Section of Bioinformatics

Mapping gene expression data onto genome-scale human metabolic models

A thesis submitted in partial fulfillment of the requirements for the degree of "Diplom Ingenieur in Biomedical Informatics" at the University for Health Sciences, Medical Informatics and Technology

by Michaela Willi

Supervisors:
Univ.-Prof. DI Dr. Zlatko Trajanoski
DI(FH) Dr. Stephan Pabinger
Univ.-Prof. Dr. habil. Matthias Dehmer

Hall in Tyrol, August 2012

Confirmation of the Supervisor

I hereby declare to have supervised the present thesis and consequently approve its submission with a positive assessment.

Date and signature of the supervisor:
Name of the supervisor in upper-case letters:
Acceptance by the study management (date / by):

Abstract

Many diseases, such as diabetes, are caused by malfunctions of human metabolism. Obesity is a growing health problem worldwide and provokes many metabolic diseases. The reconstruction of human metabolic models enables better research into the underlying metabolic functions and supports computational applications for analysing and visualizing the metabolism. So far, two different human metabolic networks, the Edinburgh Human Metabolic Network (EHMN) and the Human Recon 1, as well as several tissue-specific models have been published. The objective of this thesis was to map gene expression data onto genome-scale metabolic models.
For this purpose, two different approaches were pursued: the creation of two tissue-specific models and the detection of reporter metabolites. First, adipose and liver tissue datasets were selected and preprocessed in R, and the GIMME algorithm (COBRA toolbox) was applied to create the tissue-specific models. For the second approach, eight datasets from adipose tissue were chosen and differential expression analysis was carried out using the limma package in R. The differentially expressed genes of each dataset, in combination with each of the three genome-scale metabolic models (adipocyte, EHMN, and Recon 1), were used as inputs for the reporter metabolites analysis. The newly created tissue-specific models were compared with already published genome-scale metabolic models of the adipocyte and liver. The top 10 differentially expressed genes were listed in tables together with their corresponding pathways. To compare the resulting reporter metabolites, the KEGG COMPOUND and GLYCAN IDs were added manually to the Recon 1 and adipocyte models. Moreover, the pathways of the top 10 ranked reporter metabolites and of the top 10 ranked differentially expressed genes (ranked by p-value) were compared. Differences in the ranking of the reporter metabolites occurred in all comparisons. Nevertheless, good matches were obtained as well. In addition, agreement between the pathways of the top 10 genes and metabolites was observed. Depending on the kind of comparison, the differences in ranking have several causes: (i) the differences between the gene expression data; (ii) the three unequal genome-scale metabolic models; (iii) the incompleteness of the IDs of the Recon 1 and adipocyte models; and (iv) the internal IDs of the EHMN model. In conclusion, genome-scale metabolic models contain a lot of biological information; hence they are a powerful tool to study human metabolism as well as metabolic diseases.
Increasing attention is being paid to cell- and tissue-specific models in order to obtain more precise metabolic models of the key human tissues and cells.

Zusammenfassung (Summary)

Many diseases, such as diabetes, result from malfunctions of human metabolism. Obesity is currently a steadily growing health risk worldwide with many metabolic secondary diseases. To investigate the underlying mechanisms of metabolism more precisely and to analyse and visualize them better with computer-aided applications, metabolic networks of the human have been created. To date, two models of the entire human metabolism, the Edinburgh Human Metabolic Network (EHMN) and Recon 1, as well as several tissue- and cell-specific networks have been published. The objective of this thesis is the mapping of gene expression data onto genome-scale metabolic models. For this purpose, two different approaches were developed: the creation of two tissue- or cell-specific networks and the analysis of important metabolites, so-called 'reporter metabolites'. First, datasets on adipose and liver tissue were selected and preprocessed in R. Applying the GIMME algorithm (COBRA toolbox) yielded the tissue- or cell-specific networks. For the second approach, eight datasets on adipose tissue were selected. The subsequent gene expression analysis was carried out in R using the limma package. The differentially expressed genes of each dataset were combined with each of the three genome-scale metabolic models (EHMN, Recon 1, and the adipocyte model) and used as input for the reporter metabolites analysis. The created tissue- or cell-specific networks of the adipocyte and the liver were compared with already published networks. The ten top-ranked differentially expressed genes (with respect to their p-values) were presented in tables together with their associated metabolic pathways. To enable a comparison of the resulting reporter metabolites, the KEGG COMPOUND and GLYCAN IDs were added to the Recon 1 and adipocyte models. In addition, the metabolic pathways of the ten top-ranked genes and metabolites were compared. Differences between the rankings of the metabolites were found in all comparisons. Nevertheless, very similar rankings also occurred. Furthermore, matches between the metabolic pathways of the genes and the metabolites were observed. Depending on the kind of comparison, the differing ranking of the metabolites has various causes: (i) differences between the gene expression data; (ii) the use of three different genome-scale metabolic models; (iii) the incompleteness of the IDs of the Recon 1 and adipocyte models; and (iv) the internal IDs of the EHMN model. In summary, genome-scale metabolic models contain a great deal of biological information and are therefore very well suited for studying human metabolism as well as metabolic diseases. Tissue- and cell-specific models are gaining increasing attention as a means of obtaining more precise information about the important human tissues and cells.

Contents

1 Introduction
  1.1 Objectives
2 State of the art
  2.1 Metabolism
    2.1.1 Metabolic networks
    2.1.2 Metabolic pathways
  2.2 Network Visualization and Analysis
    2.2.1 Network approach
    2.2.2 Kinetic approach
    2.2.3 Stoichiometric approach
      Stoichiometric matrix
      Elementary flux mode (EFM) and extreme pathways
      Flux balance analysis
  2.3 Genome-Scale Metabolic Models and the Constraint-based approach
    2.3.1 Genome-Scale Metabolic Models
    2.3.2 Constraint-based approach
    2.3.3 Human metabolic models
      Cell- and tissue-specific models
      Systems Biology Markup Language
3 Methods
  3.1 Toolboxes
    3.1.1 COBRA toolbox
      GIMME algorithm
      Reporter metabolites algorithm
    3.1.2 TIGER toolbox
    3.1.3 OptFlux
    3.1.4 BioMet toolbox
      Reporter Features algorithm
  3.2 R and Bioconductor
    3.2.1 Preprocessing
      Background adjustment
      Normalization
      Summarization
    3.2.2 GEOquery
    3.2.3 Presence/Absence calls from Negative Probesets
    3.2.4 Linear Models for Microarray Data
  3.3 Databases
    3.3.1 ArrayExpress database
    3.3.2 BiGG database
    3.3.3 ChEBI database
    3.3.4 GEO database
    3.3.5 Human Metabolome Database
    3.3.6 KEGG databases
4 Results
  4.1 Comparison of toolboxes
  4.2 Creating tissue-specific models
    4.2.1 Expression data and preprocessing in R
    4.2.2 Presence/Absence calls from Negative Probesets
    4.2.3 Final model generation using the GIMME algorithm
  4.3 Gene expression data analysis
    4.3.1 Obtaining expression data
    4.3.2 Calculation of differential expression
  4.4 Reporter metabolites analysis
    4.4.1 Reporter Features Analysis
    4.4.2 Adapting human Recon 1
    4.4.3 Comparison of the reporter metabolites
      Overlapping metabolites over all datasets
      Overlapping metabolites in each model
      Differential gene expression in adipose tissue from obese human subjects during weight loss and weight maintenance (GSE35411) - Adipocyte model
5 Discussion
List of Figures
List of Tables
Bibliography
A Results of all selected datasets
  A.1 Gene expression in adipose tissue during weight loss (GSE11975)
    A.1.1 Differentially expressed genes
      Before vs. after energy restriction (ER)
      After energy restriction vs. after weight stabilization (WS)
      Before dietary intervention vs. after weight stabilization (DI)
    A.1.2 Comparison between the models
      Before vs. after energy restriction (ER)
      After energy restriction vs. after weight stabilization (WS)
      Before dietary intervention vs. after weight stabilization (DI)
    A.1.3 Comparison between expression data
      Adipocyte
      EHMN
      Recon 1
  A.2 Expression data from human adipose tissue (GSE15773)
    A.2.1 Differentially expressed genes
      Insulin resistant vs. insulin sensitive omental tissue
      Insulin resistant vs. insulin sensitive subcutaneous tissue
    A.2.2 Comparison between the models
      Insulin resistant vs. insulin sensitive omental tissue
      Insulin resistant vs. insulin sensitive subcutaneous tissue
    A.2.3 Comparison of expression data
      Adipocyte
      EHMN
      Recon 1
  A.3 Genome-wide analysis of adipose tissue gene expression in twin-pairs discordant for physical activity for over 30 years (GSE20536)
    A.3.1 Differentially expressed genes
      Active vs. non-active
    A.3.2 Comparison between the models
      Active vs. non-active
  A.4 Differences in subcutaneous adipose tissue gene expression between obese African Americans and Hispanic Youths (GSE23506)
    A.4.1 Differentially expressed genes
      African Americans vs. Hispanics
    A.4.2 Comparison between the models
      African Americans vs. Hispanics
  A.5 Subcutaneous adipose tissue: comparison of weight maintenance and weight regain following an 8-week low calorie diet (GSE24432)
    A.5.1 Differentially expressed genes
      Weight maintenance - before low calorie diet vs. after low calorie diet
      Weight regainer - before low calorie diet vs. after low calorie diet
    A.5.2 Comparison between the models
      Weight maintenance - before low calorie diet vs. after low calorie diet
      Weight regainer - before low calorie diet vs. after low calorie diet
    A.5.3 Comparison of expression data
      Adipocyte
      EHMN
      Recon 1
  A.6 Characterization of the initial molecular events of adipose tissue development and growth during overfeeding in humans (GSE28005)
    A.6.1 Differentially expressed genes
      Day 0 vs. day 14
      Day 0 vs. day 56
    A.6.2 Comparison between the models
      Day 0 vs. day 14
      Day 0 vs. day 56
    A.6.3 Comparison of expression data
      Adipocyte
      EHMN
      Recon 1
  A.7 Hypoxia-induced modulation of gene expression in human adipocytes (GSE34007)
    A.7.1 Differentially expressed genes
      Normoxic vs. hypoxic conditions
    A.7.2 Comparison between the models
      Normoxic vs. hypoxic conditions
  A.8 Differential gene expression in adipose tissue from obese human subjects during weight loss and weight maintenance (GSE35411)
    A.8.1 Differentially expressed genes
      Baseline vs. after weight reduction
      Baseline vs. after weight maintenance phase
    A.8.2 Comparison between the models
      Baseline vs. after weight reduction
      Baseline vs. after weight maintenance phase
    A.8.3 Comparison of expression data
      Adipocyte
      EHMN
      Recon 1
Appendix - List of Tables
Acknowledgement
Statutory declaration

Chapter 1

Introduction

Metabolism is the core cellular function [1] and describes a set of reactions that includes the degradation, synthesis and interaction of macromolecules [2]. A large number of metabolic genes and enzymes have been studied individually over a long period, and the resulting body of knowledge is called the bibliome. The bibliome describes the interactions and reactions of metabolic genes and enzymes [3]. However, bibliomic data alone are not sufficient to analyse the molecular activity within a cell. One way forward is to build models that represent the interactions among all components, and genome-scale in silico models are a powerful method for doing so [4]. Building such a model involves, on the one hand, a bottom-up reconstruction using genomic experimental and bibliomic data and, on the other hand, iterative improvement [3][5][6]. The first genome-scale in silico metabolic model of a eukaryotic cell was the reconstruction of Saccharomyces cerevisiae, which led to a better understanding of eukaryotic cellular behaviour [7]. So far, two different human metabolic networks have been reconstructed: the Edinburgh Human Metabolic Network (EHMN) by Ma et al. [8] and the Human Recon 1 by Duarte et al. [3]. These advances open new opportunities in disease research and drug development.

Because metabolism is affected by genetic, environmental and nutritional factors, metabolic malfunctions are a main contributor to human diseases [3][5]. A quick survey of the Online Mendelian Inheritance in Man (OMIM) database shows that 23% of metabolic genes are disease related and that 48% of metabolic reactions are influenced by those disease-related genes.
Many diseases can be classified as metabolic diseases, such as cardiovascular illnesses, cancer and diabetes [9]. For instance, it is known that cancer cells modify human metabolism during tumor development [10] and that there is an association between the central carbon metabolism and cancer development [1]. As mentioned, genome-scale in silico models are used to predict and develop new drug targets [1][10]. Research on the interactions between drugs and the metabolic system offers a new perspective for discovering drug targets [9], making them more personalized and better adapted to a specific disease. Consequently, better drug treatment will become possible. In future, genome-scale in silico models will be useful not only for modeling metabolic pathways but also as a powerful tool for researching diseases, enabling earlier diagnosis and, in consequence, an earlier start of treatment [4]. Furthermore, they are relevant for developing more efficient anticancer drugs [10] to provide better treatment for patients.

Because obesity is a growing health problem worldwide and, moreover, provokes metabolic diseases, the focus of this thesis lay on adipose tissue datasets. The number of obese people has doubled since 1980, and today obesity is ranked fifth among the leading risks for global death. The prevalence of overweight and obesity is increasing worldwide, in high- and middle-income countries as well as in low-income countries. Moreover, 65% of the world's population lives in countries where more deaths are caused by overweight and obesity than by underweight and undernourishment [11]. The main cause of overweight and obesity is the combination of an increased energy intake and a decrease in physical activity [11].
The Body-Mass-Index (BMI) is widely used to classify people as overweight or obese and is calculated as the weight of a person (in kg) divided by the square of the height (in meters). The resulting number classifies people as underweight (under 18.5), normal (18.5 - 25), overweight (25 - 30), or obese (above 30) [12]. Among other things, obesity is associated with several health consequences, such as noncommunicable diseases [11]:
- Cardiovascular disease
- Type 2 diabetes
- Musculoskeletal disorders
- Certain kinds of cancer, such as breast, colon and endometrial cancer
- Insulin resistance
- Hypertension

At the cellular level, obesity causes modifications and dysfunctions in adipose tissue, including macrophage infiltration, inflammation, and fibrosis [13]. Moreover, it introduces changes in the fatty acid metabolism, leading to an increased fatty acid flux in adipose tissue and, furthermore, to metabolic dysfunctions in liver and skeletal metabolism [14].

It is known that adipose tissue plays an important role in multiple human metabolic pathways. The inflammatory pathways and the macrophage infiltration in adipose tissue are interconnected with obesity [15]. Given the multitude of associated diseases, much scientific work remains to be done to obtain more knowledge about obesity and the resulting health problems.

1.1 Objectives

The goal of the thesis is to map gene expression data onto genome-scale metabolic models in order to explore pathophysiological modifications in adipose tissue. The specific aims of the thesis are:
1. Construction of two human tissue-specific models for liver and adipocyte
2. Collection and preprocessing of gene expression data from human adipose tissue
3. Adaptation of human metabolic models
4. Illustration of the differences between models and datasets
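
As a worked illustration of the BMI definition given in the introduction (weight in kg divided by the square of the height in meters), the calculation and classification can be sketched in a few lines of Python. This is only a minimal sketch: the function names and the example person are hypothetical, and the handling of the exact boundary values (25 and 30) is an assumption, since the text does not specify to which category they belong.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body-Mass-Index: weight (kg) divided by the square of the height (m)."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Map a BMI value to the categories named in the text.

    Assumption: boundary values belong to the higher category
    (25 -> overweight, 30 -> obese); the text does not specify this.
    """
    if value < 18.5:
        return "underweight"
    if value < 25:
        return "normal"
    if value < 30:
        return "overweight"
    return "obese"


# Hypothetical example person: 95 kg, 1.75 m
example = bmi(95, 1.75)
print(round(example, 1), bmi_category(example))
```

For the hypothetical person above the BMI is roughly 31, which falls into the obese category.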
Chapter 2

State of the art

2.1 Metabolism

All cells of living organisms, whether prokaryotic or eukaryotic, possess their own metabolism. Metabolism is a highly organized process consisting of many interconnected chemical reactions, which are catalyzed by enzymes that control the synthesis and decomposition of macromolecules [2]. For instance, these chemical reactions regulate the following processes: the generation of membranes, the replication and repair of DNA, the development of new cells, transport processes, and the transformation of one molecule into another [2][16][17]. These are only a few examples of metabolic processes, but they show the importance of metabolism for vital functions [1][3][18]. In addition to the processes mentioned, metabolism is responsible for producing energy and generating components for the organism [2], which leads to the two main categories of chemical reactions: catabolism and anabolism. Catabolism decomposes macromolecules and yields energy, while anabolism uses energy to build new cellular components [18][19].

2.1.1 Metabolic networks

The sequences of related reactions can be illustrated in so-called metabolic networks, which connect many pathways [20]. These maps are usually very complex and seem incomprehensible at first sight, so it is important to focus on individual metabolic pathways in order to draw conclusions [18].

2.1.2 Metabolic pathways

A metabolic pathway is a stepwise process [18] and can generally be described as a chain of reactions [2]. As mentioned before, there are two categories of reactions, which leads to two kinds of metabolic pathways: catabolic and anabolic pathways [18][19]. Each pathway has one entry point and may have several exit points [20]. Metabolic pathways are often studied in lower organisms, because these are less complex and easier to characterize.
Subsequently, these pathways can be compared with those of higher organisms to draw conclusions about the similarity of functions [18][19].

2.2 Network Visualization and Analysis

The representation and analysis of metabolic networks can be divided into three main approaches: the network approach, the stoichiometric approach, and the kinetic approach. Each method operates at a different level of detail and consequently depends on different input information [21]. All three approaches are shown in figure 2.1, where the triangles represent two main features: the size of the system and the level of detail. The network approach is well suited for dealing with large systems but offers only a low level of detail, while the kinetic approach provides a high level of detail but can handle only a small part of the system. The third approach, the stoichiometric approach, is a compromise between the two, with a medium system size and a medium level of input detail. As a general rule, it can be stated that obtaining more precise information and increasing the significance of the results requires reducing the size of the system and supplying more detailed information. With the network approach only qualitative predictions are possible, whereas the stoichiometric and kinetic approaches allow quantitative predictions [21][22].

Figure 2.1: Illustration of the different approaches for visualizing and analysing metabolic networks, taken from [22].
2.2.1 Network approach

Networks are used to represent 'real-world' objects, and networks in systems biology mainly serve these four aims [23]:
- Visualisation of complex biological structures
- Interpretation of the network as a model, based on the mathematical approach used
- Representation of the network as a data structure to extract biological information
- Promotion of a more comprehensive knowledge of biological structures and processes by generating new networks

Graphs are either directed or undirected. A simple network consists of nodes (vertices), representing for example metabolites, and edges, representing the interactions between metabolites. Networks can be assigned to the following network classes [23]:
- Regular networks: Each node in this network has the same degree, meaning for example that each node is connected to three other nodes by three edges.
- Random networks: A random graph is characterized by a probability that describes the likelihood of an edge between two nodes.
- Directed acyclic graphs: A directed acyclic graph is a special kind of directed graph that does not contain directed cycles. For example, ontologies within bioinformatics or systems biology are visualized using directed acyclic graphs.
- Trees: A tree is a connected acyclic graph in which any two nodes are connected by exactly one path.
- Generalized trees: Generalized trees are an extension of the network class trees. This representation consists of nodes, edges and levels, and each node is assigned to a level. In contrast to trees, generalized trees allow edges between two nodes within one level and between nodes where one or more levels are skipped. Consequently, more paths are available to reach a node.
- Small-world networks: Small-world networks are characterized by a short path length and a high clustering coefficient. This implies that one node can be reached from another node using only a small number of edges.
- Scale-free networks: Scale-free networks are observed in 'real-world' networks. The degree distribution follows a power law, which can arise when a newly added node is connected to already existing nodes with a certain probability.

For systems biology, gene networks are of interest because their nodes represent genes or gene products and their edges represent molecular interactions. Metabolic networks are one type of gene network [23]; they can be constructed as a bipartite graph with directed or undirected edges, as shown in figure 2.2. Bipartite means that there are two kinds of vertices: one set represents the reactions and the other represents the metabolites [21][22][24]. Because of the two sets of nodes, the vertices of such a graph can be separated into two disjoint sets, one containing the reactions and the other containing the metabolites [21][22]. The edges of this graph lead from a metabolite vertex to a reaction vertex and vice versa. This means that a metabolite node, which represents a compound to be catabolized, leads through an edge to a reaction node, which represents the enzyme, and then on to another metabolite node, which represents a product [24].

Figure 2.2: A bipartite graph. The circles represent the metabolite vertices and the rectangles the reaction vertices. The figure is redrawn from [24].

To analyse such a metabolic network, different network descriptors for undirected or directed graphs can be applied to characterize structural properties [23]. For instance, connectivity, clustering coefficient and various centrality measures are especially useful for undirected graphs. In contrast, stoichiometric properties and chokepoints are descriptors for directed metabolic graphs [24].
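
The bipartite representation described above can be sketched in plain Python. The following is a minimal sketch under stated assumptions: the three-reaction toy network, the variable names and the helper `reaction_steps` are hypothetical and are not taken from EHMN, Recon 1 or any toolbox. It builds the directed metabolite/reaction graph, checks that every edge connects a metabolite with a reaction, and computes two simple descriptors: the connectivity (degree) of each metabolite and the number of reaction steps between two metabolites.

```python
from collections import defaultdict, deque

# Hypothetical toy network (not taken from any published model):
# each reaction maps to (substrates, products).
reactions = {
    "R1": (["A"], ["B"]),
    "R2": (["B"], ["C", "D"]),
    "R3": (["B", "D"], ["E"]),
}

# Directed bipartite graph: metabolite -> reaction -> metabolite.
successors = defaultdict(set)
for rxn, (substrates, products) in reactions.items():
    for m in substrates:
        successors[m].add(rxn)
    for m in products:
        successors[rxn].add(m)

metabolites = {m for subs, prods in reactions.values() for m in subs + prods}

# Bipartite check: no edge connects two nodes of the same kind.
is_bipartite = all(
    (src in metabolites) != (dst in metabolites)
    for src, targets in successors.items()
    for dst in targets
)

# Connectivity (degree) of a metabolite = number of reactions it takes part in.
degree = {
    m: sum(m in subs or m in prods for subs, prods in reactions.values())
    for m in metabolites
}


def reaction_steps(start, goal):
    """Fewest reactions needed to get from one metabolite to another (BFS)."""
    queue, seen = deque([(start, 0)]), {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist // 2  # two edges (metabolite -> reaction -> metabolite) per step
        for nxt in successors[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # goal is not reachable from start


print(is_bipartite, degree, reaction_steps("A", "E"))
```

In this toy network metabolite B takes part in all three reactions and therefore has the highest connectivity, while E cannot be converted back to A because the edges are directed.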
Topological properties can be calculated and interpreted as shown in the following examples. Metabolic networks have a notably short average path length, which indicates a small-world property. This measure can be used to assess properties such as [21][22]:
- The time needed to spread information within the network
- The damage caused by deletions of enzymes
- The differences between two compared organisms
- The hierarchical structure of metabolic networks

Examples show that the degree distribution of metabolic graphs is very heterogeneous, which indicates a scale-free network in which a few metabolites are strongly connected and take part in many reactions. These metabolites are referred to as hubs. Furthermore, the degree distribution is an indicator of the robustness and error tolerance of a metabolic network, for example towards the deletion of nodes or edges. Centrality measures are another approach for identifying important components of a network [21][23].

The network approach has two main advantages: it is practicable for large-scale metabolic models, and only topological information is needed. Kinetic parameters are not required, which allows the analysis of less well-known organisms. A drawback is that metabolic graphs are bipartite, which requires a more sophisticated analysis. Moreover, dynamic properties of the network cannot be analysed and only qualitative descriptions can be calculated [22].

2.2.2 Kinetic approach

The kinetic approach, a quantitative evaluation of the properties of a metabolic network [22], considers only single molecules and their interactions [2]. Therefore, this approach is limited to small systems or pathways [22], although kinetic representation and analysis is the most detailed of the three approaches mentioned [21].
The metabolic processes are described by mass-balance equations or in terms of ordinary differential equations [21][22]. Figure 2.3 shows an example [22].

Figure 2.3: Example of a minimal model of glycolysis illustrating the kinetic approach [22]. A is the reaction scheme, a graphical representation of the minimal model. It shows that one unit of glucose (G) is converted by reactions into two units of pyruvate (P). B shows the stoichiometric matrix N, which lists the metabolites in its rows and the reactions in its columns. Gx, Px, and Glx represent external metabolites, which are not part of the stoichiometric matrix. C is the reaction list of the model and D the dynamic mass-balance equation or system of differential equations [21][22].

As already mentioned, the advantage of this approach is that it returns very detailed results and a quantitative prediction of the dynamic behaviour. However, it requires detailed information about enzyme-kinetic rate functions and kinetic parameters. This information is often lacking, and reliable kinetic parameters are difficult to find even where they exist [21][22].

2.2.3 Stoichiometric approach

The stoichiometric approach, which is based on a mathematical representation of the network as a stoichiometric matrix [2], takes structural properties as well as constraints into account [22]. It is independent of kinetic information, and the required stoichiometric information about a network is often available [21][22]. Moreover, it is computable for large-scale networks and more predictive than the network approach, and it returns quantitative predictions, which are more suitable for analysing metabolic models.
However, stoichiometric analysis only considers the steady state and therefore does not allow dynamic conclusions to be drawn. Furthermore, because dynamic properties are missing, predictions about allosteric regulation are not possible [22].

Stoichiometric matrix

The stoichiometric matrix contains information about the structure of the network [2] and is essential for predicting network functions [22]. This information describes the connections and interactions between molecules [22], so that the amount and kind of molecules consumed and produced can be read from it [21]. The stoichiometric matrix is used to calculate the possible fluxes within a network at steady state. The matrix itself is an m × n matrix, where m is the number of metabolites (rows) and n the number of reactions (columns) [2][22]. As the number of metabolites is usually smaller than the number of reactions, the system of equations generally has no unique solution [21]. In steady state, the linear equation 2.1 holds [2][22]:

\[ \frac{dS(t)}{dt} = 0 \;\Rightarrow\; N\,v(S, k) = 0 \tag{2.1} \]

m ... number of metabolites
n ... number of reactions
S ... m-dimensional time-dependent vector of the metabolite concentrations
N ... m × n stoichiometric matrix
v ... n-dimensional vector of rate equations
k ... set of parameters

Elementary flux mode (EFM) and extreme pathways

In this approach the stoichiometry is used to find all possible routes from one metabolite to another [2] through a metabolic network. All these pathways, which operate together under steady-state conditions, are represented as flux modes. Moreover, the EFMs are unique for a network [22]. Elementary flux modes form a minimal set of flux vectors, meaning that they cannot be simplified further. An example is shown in figure 2.4. Existing software allows the computation of elementary flux modes, but because overlapping pathways lead to an exhaustive enumeration, the computation is limited to medium-sized metabolic models.
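A flux mode is, first of all, a flux vector that satisfies the steady-state condition of equation 2.1. The following minimal sketch shows how a stoichiometric matrix can be assembled from a reaction list and how a candidate flux vector can be checked against N v = 0. The three-reaction network is a hypothetical toy example, loosely inspired by the conversion of one glucose (G) into two pyruvate (P) mentioned in figure 2.3. The function names are illustrative and do not belong to any toolbox.

```python
# Hypothetical toy network; coefficients are negative for substrates and
# positive for products. External metabolites are left out of the matrix.
metabolites = ["G", "P"]
reactions = {
    "uptake": {"G": 1},               # -> G
    "glycolysis": {"G": -1, "P": 2},  # G -> 2 P
    "export": {"P": -1},              # P ->
}


def stoichiometric_matrix(metabolites, reactions):
    """Rows correspond to metabolites, columns to reactions."""
    return [[coeffs.get(met, 0) for coeffs in reactions.values()]
            for met in metabolites]


def mass_balance(N, v):
    """Return N v, the net production rate of every metabolite."""
    return [sum(n_ij * v_j for n_ij, v_j in zip(row, v)) for row in N]


N = stoichiometric_matrix(metabolites, reactions)

steady = mass_balance(N, [1, 1, 2])      # all entries zero: steady state
scaled = mass_balance(N, [2, 2, 4])      # a scaled vector is balanced as well
unbalanced = mass_balance(N, [1, 1, 1])  # P accumulates: not a steady state
```

The fact that every multiple of a balanced flux vector is balanced as well is a small illustration of why the steady-state equation alone has no unique solution.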
Very similar to elementary flux modes are extreme pathways. The difference is that extreme pathways, in contrast to EFMs, only allow irreversible reactions, so that a reversible reaction is split into a forward and a backward reaction [2][22].

Figure 2.4: Two similar reaction networks and the corresponding elementary flux modes. The figure is taken from [2].

Flux balance analysis

Flux balance analysis (FBA) [25] is a widely used approach for analysing large-scale metabolic networks [22]. It uses linear optimization to calculate the steady-state flow through the network in order to predict the optimal performance of a specific organism or the optimal production rate of a metabolite [25][26]. Flux balance analysis can be divided into five steps, as shown in figure 2.5 from Orth et al. [25].

The first step is the definition of the reactions, followed by the definition of the stoichiometric coefficients of each reaction to build the stoichiometric matrix N [25][26]. In addition, constraints as well as lower and upper bounds are defined.

The constraints set by FBA are defined as follows [2][27]:
1. The assumption of steady state: N v(S, k) = 0
2. The irreversibility of reactions: v ≥ 0
3. The limited capacity of enzymes to convert metabolites: v ≤ v_max

Figure 2.5: The five steps of flux balance analysis [25].

In addition, FBA allows the definition of further constraints, each belonging to one of four groups [26][28]:
1. Physico-chemical constraints, such as reaction rates
2. Spatial or topological constraints, such as the crowding of molecules inside the cell
3. Condition-dependent environmental constraints, such as temperature or nutrient availability
4. Regulatory constraints, such as transcriptional and translational regulation or enzyme regulation

Another important aspect is the definition of upper and lower bounds for each reaction, as they define the maximal and minimal allowable flux, respectively [25][26].
The third step is setting up the system of linear equations, which may have more than one solution because the number of reactions is higher than the number of metabolites [21]. The fourth step is setting the objective function and then calculating those fluxes that maximize or minimize the objective function within the solution space [25]. This is a critical step, because it determines the goal of the study [26] by defining how much each reaction contributes to the result [25]. There are different kinds of objective functions; the most common one is the maximization of cell growth or biomass [26][27]. Other objective functions are, for example, the minimization of ATP production, the minimization of nutrient uptake, the maximization of metabolite production, the maximization of biomass and metabolite production, or optimal metabolite channeling [26]. The objective function is defined in equation 2.2 [27]:

\[ Z = \sum_{i=1}^{r} c_i v_i \tag{2.2} \]

Z ... objective function
c_i ... weight of each flux
v_i ... flux

The last step is to combine the mathematical representation of the metabolic reactions and the objective function, and to solve the resulting system of linear equations by linear programming [25][27].

The advantages of this approach are that no kinetic parameters are needed and that the calculation is quick even for large-scale metabolic models. The drawback is that, due to the lack of kinetic parameters, the results are less detailed. Another disadvantage is that only steady-state fluxes can be calculated, which limits the predictive power [22][25].

Additionally, four other situations cause problems. The first two are parallel metabolic routes, in which two enzymes catalyze the same reaction, and reversible routes. In these cases the optimization only takes the combined flux of both routes into account, and the fluxes cannot be handled separately.
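The parallel-route problem can be made concrete with a small sketch. It assumes a hypothetical network in which metabolite A is converted to B by two parallel reactions (r1 and r2) and the objective is the export of B. All numbers, bounds and function names are illustrative. No linear programming solver is used here; the sketch only verifies that several different flux distributions are feasible and give the same value of the objective function Z from equation 2.2.

```python
# Columns: uptake (-> A), r1 (A -> B), r2 (A -> B), export (B ->)
N = [
    [1, -1, -1, 0],  # metabolite A
    [0, 1, 1, -1],   # metabolite B
]
lower = [0, 0, 0, 0]      # all reactions irreversible
upper = [10, 10, 10, 10]  # hypothetical capacity limits
c = [0, 0, 0, 1]          # objective: maximize the export flux


def is_feasible(v):
    """Check the steady state N v = 0 and the flux bounds."""
    balanced = all(sum(n * x for n, x in zip(row, v)) == 0 for row in N)
    bounded = all(lo <= x <= hi for lo, x, hi in zip(lower, v, upper))
    return balanced and bounded


def objective(v):
    """Z = sum of c_i * v_i."""
    return sum(ci * vi for ci, vi in zip(c, v))


# Three ways of distributing the same total flux over r1 and r2.
candidates = [
    [10, 10, 0, 10],
    [10, 0, 10, 10],
    [10, 4, 6, 10],
]
results = [(is_feasible(v), objective(v)) for v in candidates]
```

All three candidates reach Z = 10, the maximum allowed by the uptake bound, so the optimization fixes only the sum of the two parallel fluxes and not their individual values.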
A similar problem arises with cyclic fluxes: they cannot be resolved either, since they have no influence on the fluxes of the network [22]. Futile cycles are the last problematic case [22]. They occur if two opposing reactions are catalyzed by different enzymes at the same time, so that the energy involved is dissipated [29]. These cycles are usually not captured by optimization criteria, because they do not correspond to an optimal solution; hence they are not included in the results of flux balance analysis, even though they are very common in many organisms [22].

2.3 Genome-Scale Metabolic Models and the Constraint-based approach

The stoichiometric approach is currently the most suitable approach for analysing metabolic networks [21][22], which has encouraged the reconstruction of genome-scale metabolic models.

2.3.1 Genome-Scale Metabolic Models

The first genome-scale metabolic models of viruses were constructed in the early 1990s [30]. A genome-scale metabolic model describes the relationships within a metabolic network on a genotype-phenotype level [6]. Biological network models are mostly bottom-up reconstructions, that is, component-by-component constructions, from genomic and bibliomic data. This leads to a biochemically, genetically and genomically structured (BiGG) reconstruction of a tissue-specific or non-tissue-specific metabolic network [5][31]. Genome-scale models exist for all three domains of life: archaea (single-cell microorganisms), bacteria (prokaryotic microorganisms) and eukarya. The most studied genome-scale models [32] are those of Escherichia coli (bacteria) [33][34] and Saccharomyces cerevisiae (eukarya) [7].

The generation of a genome-scale metabolic model consists of four main steps [35][36]. The first step comprises collecting all biological components that are relevant for the reconstruction, together with the necessary information about them [32][35][36].
Next, the metabolic network is reconstructed by generating a metabolic reaction list that connects the selected components and by constructing the gene-protein-reaction relationships, which define the protein or protein complex that catalyzes a reaction [32][35]. The third step is the transformation of the reconstructed model into a mathematical representation [32][35][36]. The resulting stoichiometric matrix N is an m × n matrix, where m is the number of metabolites and n the number of reactions [32]. Each column of the matrix corresponds to a reaction and each row to a metabolite. Positive entries represent products and negative entries substrates [35]. This matrix can be used for computational analysis. In the final step the network is evaluated. By comparing simulation results with published experimental data [35][36], it is possible to find mistakes such as missing metabolic functions and wrong assignments of reversibility [35].

Usually, the construction of such a network is an iterative process that needs several iterations to reach the final result [7][32][35]. Finishing a model is therefore very labour- and time-intensive [35]. In general, a genome-scale metabolic model represents both a BiGG knowledge base and a mathematical (in silico) model that enables constraint-based analysis.

2.3.2 Constraint-based approach

Different mathematical approaches can be used to study genome-scale metabolic models. As already mentioned, many of them need a large number of detailed kinetic parameters; because this information is often lacking, their applicability is limited [31][32][37]. Moreover, many of these approaches aim to predict detailed network functionalities. In contrast, the constraint-based approach (CBM) is data-driven and uses a mathematical model such as a genome-scale metabolic model.
The goal of this approach is to find those network states that can be achieved while excluding all others [32]. It predicts properties of the network in silico [4] and assumes a steady state [37], so that the flow of metabolites through the network can be observed [38]. In this way it is possible to identify gaps, that is, reactions that do not carry a flux [3][38]. Constraint-based modeling makes it possible to study tissue-specific metabolic behaviour [39] and to identify phenotypes of microorganisms, such as growth rate, nutrient uptake, product secretion and the outcome of gene deletions [35][39]. The advantage is that only physico-chemical and environmental constraints, such as mass, energy, charge, reaction fluxes or thermodynamics [4][31], are used [4][39][40].

2.3.3 Human metabolic models

So far, two global human metabolic models, both published in 2007, are widely used for systematic studies: the Homo sapiens Recon 1 model [3] and the Edinburgh Human Metabolic Network (EHMN) [8][41]. Both models contain compartmentalized metabolites, although EHMN has only included compartments since its second version, published in 2010 [41]. Table 2.1 shows properties of both models [3][41]:

Model     Reactions   Metabolites   Genes   Compartments
EHMN      6216        6522          2322    8
Recon 1   3743        2766          1496    8

Table 2.1: Comparison of the human metabolic models.

Figure 2.6 illustrates the four major applications of global human metabolic models [42]:
1. Gene expression data can be used for the reconstruction of cell- and tissue-specific models
2. Reconstruction of similar mammalian models
3. Interpretation of gene expression data by mapping them onto global human metabolic networks
4. Simulation of pathological and drug states

Figure 2.6: The four major applications of global human metabolic models [42].
Cell- and tissue-specific models

Human metabolism cannot only be viewed globally; it is also important to generate cell- and tissue-specific models that represent metabolism by taking tissue-specific information into account [42][43]. The development of three different algorithms for building cell- and tissue-specific models has accelerated the reconstruction of new models of this kind [43]. For instance, the following cell- and tissue-specific models have already been reconstructed:
- human liver [37][44]
- kidney [45]
- brain [46]
- erythrocyte [47]
- alveolar macrophage [43]
- adipocyte [48]
- myocyte [48]
- hepatocyte [48]

Systems Biology Markup Language

Systems Biology Markup Language (SBML) [49] is a format based on the Extensible Markup Language (XML) for describing biological networks and processes, such as pathways. Each component of a model is defined in a specific list, and the definition of each component is optional. The lists are independent in principle, but depending on the complexity of the model, dependencies among them can exist. SBML Level 3 Version 1 is the most recent release, where 'level' denotes the edition and 'version' denotes small updates within a specific release. SBML is a computer-readable language, hence software packages can translate SBML models into internal models and vice versa [49].

Figure 2.7 from Gianchandani et al. [50] illustrates metabolic network reconstruction and analysis as an iterative workflow and at the same time points out the connections between the preceding sections.

Figure 2.7: The reconstruction of a metabolic reaction network is based on data from the literature and on gene-protein-reaction (GPR) relationships from experimental data. This information is converted into a stoichiometric matrix for the subsequent simulation step, which is an iterative process. Flux balance analysis is used to calculate the steady-state fluxes through the network under the given constraints.
The results are analysed and validated with different methods. Subsequently, these outcomes become part of, for example, the published literature or online databases, which may serve as new input for metabolic network reconstructions [50].

Chapter 3

Methods

3.1 Toolboxes

3.1.1 COBRA toolbox

The constraint-based reconstruction and analysis (COBRA) toolbox is a MATLAB package for the analysis, prediction, and simulation of phenotypes. It uses bottom-up constructed genome-scale metabolic models, which are stored in the Systems Biology Markup Language (SBML) file format and can be imported into MATLAB by converting them into a COBRA model. The MATLAB model consists of different fields, which include amongst others:
- rxns: list of all reaction abbreviations
- mets: list of all metabolite abbreviations
- S: stoichiometric matrix
- rev: defines whether reactions are reversible or not
- lb/ub: lower/upper bounds of reactions
- c: objective coefficients
- genes: list of all genes (optional)
- rxnGeneMat: reaction-gene matrix (optional)
- rxnNames: list of all reaction names (optional)
- metNames: list of all metabolite names (optional)
- metChEBIID/metKEGGID/metPubChemID/metInChIString: one list for each metabolite ID (optional)

The COBRA toolbox offers a wide range of methods and allows community members to provide new add-ons. Figure 3.1 gives an overview of the COBRA toolbox, including seven categories of COBRA methods and additional functionalities for reading and writing models, for testing the toolbox, and for integrating different solvers [5].

Figure 3.1: Overview of the COBRA toolbox functionalities [5].

Most COBRA methods follow the constraint-based approach by returning a reduced set of solutions rather than a unique one. This requires the use of different constraints and metabolic objectives to calculate possible network states under a defined set of conditions [5].
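To make the relationship between the mandatory model fields more tangible, the following sketch mirrors them in a plain Python dictionary. This is not the MATLAB structure used by the COBRA toolbox and not part of its API; the toy values and the helper check_dimensions are assumptions made purely for illustration. The sketch checks that S has one row per metabolite and one column per reaction, and that rev, lb, ub and c have one entry per reaction.

```python
# Plain-Python stand-in for the fields listed above (hypothetical toy values).
model = {
    "rxns": ["uptake", "conversion", "export"],
    "mets": ["A", "B"],
    "S": [[1, -1, 0],
          [0, 1, -1]],
    "rev": [False, False, False],
    "lb": [0.0, 0.0, 0.0],
    "ub": [10.0, 1000.0, 1000.0],
    "c": [0.0, 0.0, 1.0],
}


def check_dimensions(model):
    """Return a list of size inconsistencies between the fields (empty if none)."""
    n_mets, n_rxns = len(model["mets"]), len(model["rxns"])
    problems = []
    if len(model["S"]) != n_mets:
        problems.append("S must have one row per metabolite")
    if any(len(row) != n_rxns for row in model["S"]):
        problems.append("S must have one column per reaction")
    for field in ("rev", "lb", "ub", "c"):
        if len(model[field]) != n_rxns:
            problems.append(f"{field} must have one entry per reaction")
    return problems


problems = check_dimensions(model)
```

For the toy model above the list of problems is empty; removing an entry from one of the per-reaction fields would be reported by the check.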
GIMME algorithm

The Gene Inactivity Moderated by Metabolism and Expression (GIMME) [51] algorithm makes it possible to create a tissue-specific model algorithmically from a genome-scale metabolic model and expression data [42][51]. The GIMME algorithm requires three different inputs as tab-separated text files:
- gene expression data
- a genome-scale reconstructed network
- one or more Required Metabolic Functionalities (RMF) defining the new model

The algorithm follows a two-step procedure. In the first step, FBA is executed to calculate the maximal possible fluxes through all RMFs. In the second step, the RMFs are constrained to be at or above a minimum level, for instance a percentage of the maximum found by FBA. In addition, the user sets a cutoff value for classifying reactions as active or inactive. However, a reaction may be classified as inactive even though it is necessary to achieve the RMFs. To prevent this problem, the following linear optimization (formula 3.1) is used to find the most consistent set of reactions and consequently to reactivate misclassified ones [51].

\[
\begin{aligned}
\text{Minimize:} \quad & \sum_i c_i \cdot |v_i| \\
\text{Subject to:} \quad & S \cdot v = 0 \\
& a_i < v_i < b_i \\
\text{where} \quad & c_i = \begin{cases} x_{\mathrm{cutoff}} - x_i & \text{where } x_{\mathrm{cutoff}} > x_i \\ 0 & \text{otherwise} \end{cases} \quad \text{for all } i
\end{aligned}
\tag{3.1}
\]

c_i ... weight of each flux in the minimization
v_i ... flux vector
S ... stoichiometric matrix
x_cutoff ... cutoff value
x_i ... normalized gene expression data mapped onto each reaction
a_i, b_i ... lower and upper bound of each reaction, taking the RMFs into account

The result of the GIMME algorithm is a reduced network with a minimal inconsistency score (IS), which describes the disagreement between the expression data and the objective function. To enable a more intuitive interpretation, the IS values are converted into a normalized consistency score (NCS), which characterizes how well the gene expression data fit the objective function [51].
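The weights c_i and the objective of formula 3.1 can be illustrated with a short sketch. It is not the COBRA implementation of GIMME and it does not solve the optimization problem; it only computes the weights for a given cutoff and evaluates the sum of c_i · |v_i| for a given flux vector. The expression values, the cutoff and the fluxes are hypothetical, as are the function names.

```python
def gimme_weights(expression, cutoff):
    """c_i = cutoff - x_i where the expression lies below the cutoff, else 0."""
    return [max(cutoff - x, 0.0) for x in expression]


def inconsistency_score(weights, fluxes):
    """Objective of formula 3.1: the sum of c_i * |v_i|."""
    return sum(c * abs(v) for c, v in zip(weights, fluxes))


# Hypothetical normalized expression values mapped onto four reactions.
expression = [8.0, 3.0, 5.5, 1.0]
cutoff = 5.0

weights = gimme_weights(expression, cutoff)

# A hypothetical flux distribution: reaction 2 carries flux although its
# expression is below the cutoff, which is penalized; reaction 4 is unused.
fluxes = [10.0, -2.0, 4.0, 0.0]
score = inconsistency_score(weights, fluxes)
```

Here only the second reaction contributes to the score (weight 2.0 times an absolute flux of 2.0), because the fourth reaction, although poorly expressed, carries no flux.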
Reporter metabolites algorithm

The reporter metabolites algorithm by Patil and Nielsen [52] identifies metabolites that play an important role in metabolic regulation, as well as highly correlated subnetworks. For this purpose, gene expression data are mapped onto genome-scale metabolic models to determine the so-called reporter metabolites. Figure 3.2 shows the step-wise procedure of the reporter metabolites algorithm.

Figure 3.2: Illustration of the step-wise procedure of the reporter metabolites algorithm [52].

The starting point is a genome-scale metabolic model, from which two new networks are derived: a metabolic network and an enzyme interaction network. The metabolic network is a bipartite undirected graph with metabolites and enzymes as nodes and their interactions as edges. A metabolite is involved in one or more reactions and is consequently connected to all enzymes catalyzing these reactions. The enzyme interaction network is a unipartite graph in which enzymes are nodes and metabolites are edges, meaning that enzymes sharing a metabolite in a reaction are connected.

The next step is mapping transcriptional data onto the enzyme nodes of both graphs. Two kinds of transcriptional data can be used: differential data (e.g. the comparison of two different conditions) and multidimensional data (e.g. the comparison of multiple conditions). Differential data are mapped onto the enzyme nodes using Student's t-test, which yields p-values, where each p-value represents the significance of the change of an enzyme. For multidimensional data, the absolute Pearson correlation coefficient P is calculated for each edge between nodes. Both p-values and P-values follow a uniform distribution and are therefore converted to Z scores using the inverse normal cumulative distribution; the result is called the normalized transcriptional response [52].
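The conversion of p-values into Z scores, together with the aggregation over the neighbour enzymes of a metabolite given in formula 3.2, can be sketched as follows. The sketch assumes the convention Z = Φ⁻¹(1 − p), so that small p-values yield large positive scores; this sign convention, the function names and the example p-values are assumptions made for illustration, and the correction for the background distribution is omitted.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def p_to_z(p_value):
    """Inverse normal cumulative distribution of (1 - p); requires 0 < p < 1."""
    return STANDARD_NORMAL.inv_cdf(1.0 - p_value)


def metabolite_score(neighbour_p_values):
    """Sum of the neighbours' Z scores divided by the square root of their number."""
    z_scores = [p_to_z(p) for p in neighbour_p_values]
    return sum(z_scores) / sqrt(len(z_scores))


# Hypothetical p-values of the enzymes neighbouring two metabolites.
strongly_changed = metabolite_score([0.025, 0.025, 0.025, 0.025])
unchanged = metabolite_score([0.5, 0.5])
```

A metabolite whose neighbour enzymes all change significantly obtains a high score, whereas p-values of 0.5 correspond to a score of zero.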
Reporter metabolites are finally identified by scoring each metabolite with the normalized transcriptional response of its k neighbour enzymes, as shown in formula 3.2 [52]:

\[ Z_{\text{metabolite}} = \frac{1}{\sqrt{k}} \sum Z_{n_i/e_j} \tag{3.2} \]

Afterwards, the Z_metabolite scores are corrected for the background distribution. The metabolites with the highest scores are defined as reporter metabolites [52].

The last step is the identification of highly correlated subnetworks within the enzyme interaction network. However, this is an NP-hard (nondeterministic polynomial-time hard) problem, the so-called clique problem, which describes the problem of finding a specific subgraph [52][53]. The reporter metabolites algorithm uses simulated annealing as a heuristic approach to find a solution. Two difficulties remain: (i) simulated annealing may return locally optimal solutions instead of the globally optimal one; (ii) the resulting subnetwork depends on the initial conditions and parameters. To overcome these problems, the simulated annealing algorithm is repeated ten times and the subnetwork with the highest score is selected [52].

3.1.2 TIGER toolbox

The Toolbox for Integrating Genome-scale Metabolism, Expression and Regulation (TIGER) [54] is a MATLAB package that aims to remedy three deficiencies of existing toolboxes, such as the COBRA toolbox [5], CellNetAnalyzer [55], and the BioMet toolbox [56], by:
- Converting COBRA models and transcriptional regulatory networks (TRNs) into integrated optimization problems
- Integrating high-throughput expression data to analyse these integrated models with existing algorithms
- Offering users the possibility to develop new algorithms based on these integrated models

The TIGER toolbox is compatible with COBRA models (figure 3.3).

Figure 3.3: The conversion of a COBRA model into a TIGER model [54].

The conversion process allows Boolean constraints to be added, which are derived from gene-protein-reaction (GPR) relationships. These GPRs are defined in Boolean logic and describe the relationships between genes, their protein products and reactions. The Boolean rules are converted step by step into systems of inequalities, then upper and lower bounds are added, and the result is a mixed integer linear program (MILP). The last step of converting a COBRA model into a TIGER model is mapping the converted rules onto the COBRA model [54].

The resulting TIGER model can be used for developing new algorithms, for applying functionalities such as flux balance analysis, and for creating context-specific networks using the GIMME [51], iMAT [39] or MADE (Metabolic Adjustment by Differential Expression) algorithms [57].

3.1.3 OptFlux

OptFlux is an open-source software platform that aims to provide a user-friendly computational tool for metabolic engineering applications. Metabolic engineering means optimizing the processes within an organism to increase the production of a certain compound. In contrast to the COBRA and TIGER toolboxes, OptFlux is a Java-based modular program and provides a graphical user interface (GUI), which offers a user-friendly environment even for users with little knowledge of the research area [58].

OptFlux provides a series of functionalities, which can be classified into four categories:
1. Model handling: allows users to read models as flat text files, as text files that follow the Metatool format [59], or as models in the SBML standard.
2. Simulation: offers methods for metabolic phenotype simulations, including FBA [25], Minimization of Metabolic Adjustment (MOMA) [60], Regulatory On/Off Minimization of metabolic flux changes (ROOM) [61] and Metabolic Flux Analysis (MFA).
3. Optimization: the aim of these methods is to optimize the objective function by identifying sets of reactions or genes that have to be deleted to reach the optimum. The implemented optimization methods are OptKnock [62] and OptGene [63].
4. Pathway analysis: provides the EFMTool [64] for elementary flux mode analysis and the possibility to export a flux to CellDesigner [65].

3.1.4 BioMet toolbox

The BioMet toolbox [56] is a web-based toolbox offering three analysis tools:
- Reporter Features
- Reporter Subnetworks
- BioOpt

The purpose of the Reporter Features algorithm [66] is the identification of transcriptional regulatory circuits in a metabolic network; it is similar to the reporter metabolites algorithm of Patil and Nielsen [52] explained in chapter 3.1.1. Reporter Subnetworks is derived from the reporter metabolites algorithm and identifies significant subnetworks, as described in detail in chapter 3.1.1. Both tools, Reporter Features and Reporter Subnetworks, use high-throughput data and genome-scale metabolic models to predict metabolic behaviour. The third tool, BioOpt, is a tool for conducting flux balance analysis [56][66].

Reporter Features algorithm

The Reporter Features algorithm is a hypothesis-driven algorithm that maps gene expression data onto genome-scale metabolic networks in order to identify groups of neighbouring genes that are significantly co-regulated in comparison to the others. The algorithm is a generalization and extension of the reporter metabolites algorithm of Patil and Nielsen [66]. The algorithm (figure 3.4) needs three kinds of input data as tab-separated text files: gene expression data, an interaction or annotation list, and a genome-scale metabolic network.

Figure 3.4: Illustration of the step-wise procedure of the Reporter Features algorithm [52].
The interaction or annotation list may contain protein-DNA interactions, protein-protein interactions, or GO annotations and can be represented as a bipartite graph. The algorithm considers genes and metabolites as nodes, and their interactions are represented by edges. A metabolite is involved in one or more reactions, which are catalysed by enzymes [56][66]. The gene expression data consist of a list of genes with their p-values from pairwise comparisons, or with their Pearson correlation coefficients P in the case of multidimensional data. In both cases the inverse normal cumulative distribution is used to convert the values into Z scores, which follow a standard normal distribution [66].

The scoring system for scoring and ranking the features is based on the distribution of the means of random groups of the same size and is a test of the null hypothesis. The score of a metabolite depends on the scores of its neighbours: their Z values are summed up and divided by the number of neighbours (formula 3.3), in contrast to reporter metabolites (formula 3.2), where the sum is divided by the square root of the number of neighbours [52][66].

\[ Z_{\text{feature } j} = \frac{1}{N} \sum_{k=1}^{N} Z_{n_i/e_{jk}} \tag{3.3} \]

The resulting Z score is corrected by subtracting the mean and dividing by the standard deviation. The corrected Z scores are converted back to p-values using the normal cumulative distribution, because the user chooses the p-value threshold below which metabolites are considered reporter metabolites [52][66].

The scoring system described here applies to first-degree Reporters; it is also possible to choose the option of higher-degree Reporters. The result of the Reporter Features algorithm can be examined for up- and down-regulated, only up-regulated or only down-regulated Reporter Features separately [66].

3.2 R and Bioconductor

R and the Bioconductor packages offer data structures and functions for importing and processing microarray data.
One-colour microarrays include one set of probe levels per microarray, whereas two-colour microarrays produce two sets of probe levels (red and green) on each microarray. Amongst others, microarray data can be imported as CEL files or in the Simple Omnibus Format in Text (SOFT). The use of CEL files requires preprocessing steps before the data can be used for further analysis. In the following sections three R packages are described in more detail [67].

Affymetrix GeneChip arrays (one-colour microarrays) are used for high-throughput gene expression analysis [68]. An Affymetrix GeneChip contains short oligonucleotides with a length of 25 bp. Because of this short length, multiple oligonucleotide probes are used for each gene to increase specificity, usually between eleven and twenty probe pairs that together form one probeset. Each probe pair contains one perfect match (PM) probe, for specific hybridization, and one mismatch (MM) probe, for non-specific hybridization. The MM probe is constructed by exchanging the thirteenth nucleobase for its complementary nucleobase [67].

3.2.1 Preprocessing

Preprocessing consists of three steps: background adjustment, normalization, and summarization. For each step a wide range of methods is available, of which three are commonly used.

Background adjustment

Background adjustment is an essential step, as it has the largest influence on accuracy and precision [69]. The aim is to improve the array measurements by adjusting the intensity readings for non-specific signals [70]. The default adjustment, provided as part of the Affymetrix system, can be described as the difference between PM and MM probe intensities [71][72].

MAS5 (Affymetrix Microarray Suite 5) is the default algorithm of Affymetrix and uses PM and MM probes [73]. It is based on the robust average of log(PM − MM*) values, where MM* denotes MM values to which corrections have been applied in order to avoid values of PM − MM* less than or equal to 0 [71]. For background adjustment the chip is divided into a grid of sixteen rectangular regions, and the lowest 2% of the probe intensities of each region are used to calculate the background values. Afterwards, each probe intensity is adjusted using a weighted average of the background values. The weights depend on the Euclidean distance between the probe and the centroids of the grid regions [67].

The RMA (Robust Multi-array Average) approach includes all three steps of preprocessing (background correction, quantile normalization, and summarization) and uses only PM probes [67][72]. RMA performs a global background adjustment, meaning that the PM values are corrected probe cell by probe cell using a global model for the distribution of the probe intensities on the microarray [67]. This is done by fitting a normal-exponential mixture model and subtracting a background estimate from the PM value of each probe, which guarantees positive results. Afterwards, the values are log-transformed [70]. Irizarry et al. [73] demonstrated that RMA outperforms MAS5.

The GCRMA (GC Robust Multi-array Average) approach is similar to RMA: it also includes all preprocessing steps and uses the same approach for normalization and summarization. The GCRMA method is based on the additive background-multiplicative-measurement error (ABME) model for reading intensities from microarray scanners. The difference between RMA and GCRMA is that GCRMA uses sequence information [67] to describe the non-specific binding component. Guanine and cytosine hybridize more strongly than adenine and thymine, because a guanine-cytosine pair forms three hydrogen bonds and an adenine-thymine pair only two [71]. Naef and Magnasco [74] developed a solution for predicting specific hybridization effects by modelling probe affinities as a sum of position-dependent base effects.
It is reported that GCRMA outperforms RMA and MAS5 [71][72].

Normalization

The normalization step is necessary to make measurements from different arrays comparable, because many sources cause variation between arrays. In RMA and GCRMA quantile normalization is used to give each array the same empirical distribution of intensities. The idea can be illustrated with a quantile-quantile plot: two data vectors with the same distribution yield a straight diagonal line with slope 1 and intercept 0. To give two datasets the same distribution, the quantiles of the two data vectors are plotted against each other and each data point is then projected onto the 45-degree line [67].

Summarization

Summarization is the last preprocessing step. It combines the multiple probe intensities of each probeset into one expression value [67].

3.2.2 GEOquery

The GEOquery package provides easy access to files in SOFT format and makes it possible to handle the information they contain. It therefore supports the use of publicly available high-throughput data with Bioconductor analysis tools [75].

3.2.3 Presence/Absence calls from Negative Probesets

The R package 'Presence/Absence calls from Negative Probesets' (panp) generates gene expression values as well as presence and absence calls. The number of present or absent probesets can serve as a first measure of chip or sample quality. This first filtering step in the analysis of differentially expressed (DE) genes is only possible with two methods: the MAS5 presence-absence method and the panp method. The MAS5 presence-absence method requires both PM and MM probes, which implies that the MAS5 preprocessing method has to be used as well [68]. However, other preprocessing methods, such as RMA or GCRMA, have since been developed. Moreover, it has been shown that MM probes may have a negative impact on the result and may cause problems.
The panp method was developed to overcome this problem, because it can be applied to data preprocessed from PM probes only as well as to data from PM and MM probes [67].

Affymetrix GeneChip probesets are designed on the basis of expressed sequence tags (ESTs), which are available in public databases. Some of these ESTs have the wrong strand direction, which makes the resulting probesets reverse complements of the true transcripts. These reverse complements are called 'Negative Strand Matching Probesets' (NSMPs) and are used as negative controls in the panp method for determining the presence and absence calls [68].

To apply the panp method, the data of a chip first have to be preprocessed with a method such as RMA, GCRMA or MAS5. The subsequent decision making of the panp method is illustrated in figure 3.5 as an expression density plot. The probability distribution of the signal intensities of the NSMPs is calculated and used to create a cumulative distribution function, which is converted into a survivor distribution (one minus the cumulative distribution) in order to derive a cutoff intensity for a given p-value. The horizontal lines at the y-axis mark the two p-value cutoffs chosen by the user. The corresponding vertical lines mark the intensities obtained by interpolating these p-value cutoffs, and these intensities classify genes as present, marginal or absent. Genes with an intensity below the leftmost cutoff line are absent, genes with an intensity above the rightmost cutoff line are present, and genes with an intensity between the two cutoff lines are marginal. As usual, the lower the p-value, the higher the significance [68].

Figure 3.5: The expression density for classifying genes as present, absent or marginal [68].

The panp method is available in R as part of Bioconductor. The R function is named pa.calls and requires an ExpressionSet as input object, one loose cutoff (default 0.02) and one tight cutoff (default 0.01).
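The decision rule described above can be sketched in a few lines. The following Python sketch is only an illustration under simplifying assumptions and is not the panp implementation: it uses the empirical survivor function of the negative-control (NSMP) intensities directly instead of a smoothed density with interpolation, and the names `survivor_p_values` and `pa_calls_sketch` as well as the example intensities are hypothetical. Python is used here only to keep the sketch self-contained.

```python
from bisect import bisect_left


def survivor_p_values(intensities, nsmp_intensities):
    """Empirical survivor p-value of each intensity.

    The p-value is the fraction of negative-control (NSMP) intensities
    that are greater than or equal to the given intensity.
    """
    controls = sorted(nsmp_intensities)
    n = len(controls)
    # bisect_left counts the controls that are strictly smaller than x.
    return [(n - bisect_left(controls, x)) / n for x in intensities]


def pa_calls_sketch(intensities, nsmp_intensities, loose=0.02, tight=0.01):
    """Classify intensities as present (P), marginal (M) or absent (A).

    Assumption of this sketch: p <= tight gives P, tight < p <= loose
    gives M, and everything else gives A.
    """
    calls = []
    for p in survivor_p_values(intensities, nsmp_intensities):
        if p <= tight:
            calls.append("P")
        elif p <= loose:
            calls.append("M")
        else:
            calls.append("A")
    return calls


# Made-up example: 1000 negative-control intensities and three probesets.
controls = list(range(1, 1001))
print(pa_calls_sketch([500, 985.5, 995.5], controls))  # ['A', 'M', 'P']
```

With this rule, setting the loose and the tight cutoff to the same value leaves no room for the marginal class, so only presence and absence calls are produced.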
The pa.calls function returns two matrices, one containing the p-values and one containing the indicators for present (P), marginal (M), and absent (A) [68].

3.2.4 Linear Models for Microarray Data

Linear Models for Microarray Data (limma) is a package for the analysis of differentially expressed genes in data from microarray experiments. A linear model is fitted to the expression data of each gene, which makes it possible to analyse both simple and more complex experiments in a uniform manner. Expression data can be log-ratios or log-intensities from one- or two-colour arrays [67][70].

A limma analysis starts with an already created ExpressionSet (eset) and needs two kinds of matrices: a design matrix and a contrast matrix. The design matrix represents the different RNA targets hybridized to the microarrays, and the contrast matrix combines the coefficients of the design matrix to enable comparisons between the RNA targets of interest. The rows of the design matrix represent the arrays in the experiment and the columns the coefficients. For simple comparisons it is not necessary to create a contrast matrix. The design matrix may be created manually or by using the command model.matrix [67][70].

Next, a linear model is fitted to the data using lmFit. lmFit fits the model defined by the design matrix, and the contrast matrix is then applied to the fit (limma provides contrasts.fit for this) to obtain estimates for the contrasts of interest. The next step is to apply the empirical Bayes method, eBayes, to borrow information between the genes. Finally, the differentially expressed genes can be listed with the topTable method [67][70].

Many different designs are available, from which the appropriate one has to be chosen and adapted for the given microarray data. Three of these designs are explained in more detail: the comparison of two groups against a common reference, the comparison of two groups on single-channel microarrays, and the comparison of paired samples [67][76].
The comparison of two groups against a common reference implies that a two-colour microarray is used, where one channel contains the common reference and the other channel contains the two groups to be compared, as can be seen in table 3.1, adapted from [67][76].

FileName   Cy3   Cy5
File 1     Ref   WT
File 2     Ref   WT
File 3     Ref   Mu
File 4     Ref   Mu
File 5     Ref   Mu

Table 3.1: Target file of the comparison of two groups against a common reference. The table is adapted from [67].

The design matrix contains two columns, and there are two possibilities to create it:

1. The first column contains the difference between wild-type and reference and the second column the difference between mutant and wild-type. In this case a contrast matrix is not necessary, because the comparison is already included in the design matrix.

2. The second approach handles the coefficients separately: the first column contains the comparison between mutant and reference and the second column the comparison between wild-type and reference. Because the comparison between mutant and wild-type is missing from the design matrix, a contrast matrix has to be created.

The comparison of two groups on single-channel microarrays is done in the same way as the comparison of two groups against a common reference. The differences in the target file can be seen in table 3.2 [67][76]:

FileName   Target
File 1     WT
File 2     WT
File 3     Mu
File 4     Mu
File 5     Mu

Table 3.2: Target file of the comparison of two groups. The table is adapted from [67].

The third design is the analysis of paired samples. This design is used to compare two kinds of treatment when the samples come in pairs, so that two persons are compared directly. For instance, one person receives treatment A and the other person receives treatment B, or one person is treated and the other person is the control [67][76]. Afterwards a moderated t-test that takes the pairing into account is applied.
The target frame is created as shown in table 3.3 [67][76]:

FileName   Group   Treatment
File 1     1       C
File 2     1       T
File 3     2       C
File 4     2       T
File 5     3       C
File 6     3       T

Table 3.3: Target file of the comparison of paired samples. The table is adapted from [67].

3.3 Databases

3.3.1 ArrayExpress database

ArrayExpress [77] is a public database for microarray gene expression data developed by the European Bioinformatics Institute (EBI). It is possible to submit, query and export three kinds of data: arrays, experiments, and protocols. The record of a dataset includes a short description, protocols, information about the samples and the platform, the citation, a contact for the author and a link to the GEO database. Expression data and other information can be downloaded for further use as well as analysed and visualized online using Expression Profiler [77][78].

3.3.2 BiGG database

The Biochemically, Genetically and Genomically (BiGG) [31] structured database of metabolic reconstructions contains ten different genome-scale metabolic models. It allows searching for metabolites, reactions, genes, proteins and literature citations, and exporting metabolic reconstructions as SBML files. Furthermore, the BiGG database offers the possibility to visualize metabolic maps showing metabolites, reactions, and text markup.

3.3.3 ChEBI database

The Chemical Entities of Biological Interest (ChEBI) [79] database was initiated in 2002 by the European Bioinformatics Institute (EBI) with the objective of standardizing biochemical terminology. The database is focused on small molecular compounds and offers a wide range of information. The information can be seen as a 'dictionary', which contains the ChEBI ID, the ChEBI name, the ChEBI ASCII name, the IUPAC name (International Union of Pure and Applied Chemistry), a definition, and synonyms.
The structure of the molecular compounds is shown as a structural diagram, as IUPAC InChI (a non-proprietary identifier for chemical structures), as InChIKey (25-character hashed version of the InChI), and as SMILES (Simplified Molecular Input Line Entry System, a chemical line notation) representation. Moreover, each entry includes information about mass, charge, formula, and the ChEBI Ontology as well as links to other databases. In addition, ChEBI offers the possibility of downloading the database in several different file formats [79][80].

3.3.4 GEO database

The Gene Expression Omnibus (GEO) [81] database contains high-throughput gene expression data and genomic hybridization data. The objective was to create a robust and flexible database with simple submission procedures and formats that covers a wide spectrum of high-throughput data [81][82]. Furthermore, it should be intuitive to query, locate, review, and download data [82].

The database structure defines a primary and a secondary database [83]. The primary database includes 'submitter-supplied data' and is divided into three components [81][82][83]: Platform (GPL), Sample (GSM), and Series (GSE). The content of the primary database is very heterogeneous. A Platform contains a summary description of the array or sequencer and, if it is an array-based Platform, a data table explaining the array template [83]. The Sample record contains the description of the material, the experimental protocols, and a data table with the abundance measurements of each feature on the corresponding Platform [82][83]. A Series record combines a set of related Samples into one study and describes the aim and the design of the study. Each component has a specific prefix followed by a unique accession number [82].

The secondary database extracts the elements that are shared across the records of the primary database in order to create upper-level objects called GEO DataSet and Profile.
A GEO DataSet includes sequence identity tracking information for each feature on the Platform, normalized expression measurements, and the text that describes the biological source and the experimental aim. Each GEO DataSet contains a collection of related Sample records and is identified by a unique accession number with the prefix 'GDS'. A GEO Profile is derived from a GEO DataSet and contains the expression measurements of one gene across all samples [82][83]. The GEO records and raw data files are freely available in different file formats, and it is also possible to submit files [82].

3.3.5 Human Metabolome Database

The Human Metabolome Database (HMDB) [84] is currently the largest organism-specific metabolomics database [85]. The database offers built-in tools for searching, viewing, and extracting metabolites, biofluid concentrations, enzymes, genes, metabolic concentration data from mass spectrometry (MS) and nuclear magnetic resonance (NMR) metabolic analysis, and diseases [84]. HMDB is updated every six months, as content and coverage are growing rapidly [85]. The compounds are classified into chemical 'kingdoms', 'classes' and 'families' [85]. The result of a general search is a summary table called MetaboCard, which contains 90 different fields of information. These fields can be divided into chemical or physico-chemical data and biological or biomedical data. Moreover, the MetaboCard offers hyperlinks to many other databases, such as KEGG, ChEBI or SwissProt [84].

3.3.6 KEGG databases

The Kyoto Encyclopedia of Genes and Genomes (KEGG) [86] project was initiated in 1995 and is a database of the Japanese GenomeNet service. Since its beginning, the database has expanded significantly and now includes three kinds of information: systems information, genomic information and chemical information [87].
Fifteen databases belong to these three categories (table 3.4). The entries of each database are identified by a database-specific prefix followed by a number, except for KEGG GENES and KEGG ENZYME, which derive their IDs from RefSeq (genes) and ExplorEnz (enzymes) [87].

Category              Database         Content                         Prefix                   Example
Systems information   KEGG PATHWAY     Pathway map                     map, ko, ec, rn, (org)   hsa04930
                      KEGG BRITE       Functional hierarchies          br, jp, ko, (org)        ko01003
                      KEGG MODULE      KEGG modules                    M, (org)_M               M00008
                      KEGG DISEASE     Human disease                   H                        H00004
                      KEGG DRUG        Drugs                           D                        D01441
                      KEGG ENVIRON     Crude drugs, etc.               E                        E00048
Genomic information   KEGG ORTHOLOGY   KO groups                       K                        K04527
                      KEGG GENOME      KEGG organisms                  T                        T01001
                      KEGG GENES       Genes in high quality genomes                            hsa:3634
Chemical information  KEGG COMPOUND    Metabolites, small molecules    C                        C00031
                      KEGG GLYCAN      Glycans                         G                        G00109
                      KEGG REACTION    Biochemical reactions           R                        R00259
                      KEGG RPAIR       Reactant pairs                  RP                       RP04458
                      KEGG RCLASS      Reaction class                  RC                       RC00046
                      KEGG ENZYME      Enzyme nomenclature                                      ec:2.7.10.1

Table 3.4: Information about the KEGG databases. The table is adapted from [87].

The KEGG PATHWAY database includes manually drawn pathway maps of interactions and reaction networks for metabolism, genetic information processing, environmental information processing, cellular processes, organismal systems, human diseases, and drug development [87][88]. The KEGG COMPOUND database provides the chemical structures of metabolites and of other chemical compounds [86], and the KEGG GLYCAN database offers glycan structures [89].

Chapter 4 Results

This chapter describes the results of the thesis. These include the tissue-specific models, the gene expression analysis of adipose tissue and the analysis of reporter metabolites. The first part of the results comprises the selection of an adequate toolbox for handling genome-scale metabolic models in SBML file format.
The second part of the results covers the creation of one adipose and one liver tissue-specific model. For this purpose, adipose and liver tissue datasets were selected and preprocessed in R, and the GIMME algorithm was applied to create the tissue-specific models. These models were compared with already published adipocyte and liver tissue-specific models. The third part describes the querying of adipose tissue gene expression data. Differential expression analysis was carried out on eight selected datasets using the limma package in R. The top 10 differentially expressed genes are shown in a table together with the pathways they are involved in. The fourth part is about the analysis of reporter metabolites. For this purpose, the reporter features algorithm was applied to each of the gene expression datasets in combination with each of the three genome-scale metabolic models: EHMN, Recon 1, and adipocyte. The resulting reporter metabolites were compared in four different ways: (i) between all 42 output files; (ii) between all output files obtained with one genome-scale metabolic model; (iii) between different models using the same expression data; (iv) between two kinds of expression data of one dataset (treatment vs. control) using only one model. To make these comparisons possible, the metabolite IDs were manually added to the Recon 1 and adipocyte models. Moreover, the pathways of the top 10 ranked genes were compared with those of the top 10 ranked reporter metabolites to detect matches.

4.1 Comparison of toolboxes

During the thesis three toolboxes (COBRA, TIGER and OptFlux) were evaluated, and after testing all of them the following conclusions were drawn.

The TIGER toolbox offers the functionality of converting COBRA models into TIGER models. However, this step is very time-consuming for large metabolic networks such as the human Recon 1 or EHMN models, and the toolbox was not capable of reading SBML files directly.
OptFlux is a standalone program that offers a variety of functionalities. However, it has no implemented functions for mapping gene expression data onto genome-scale metabolic models. Moreover, it does not provide as many possibilities for handling SBML models as, for example, the COBRA toolbox.

The COBRA toolbox offers a straightforward way of reading SBML files. It allows COBRA methods to be adapted and extended, and it supports executing MATLAB methods to handle and analyse models. Based on these advantages the COBRA toolbox was chosen for further use.

4.2 Creating tissue-specific models

This section is about the creation of one adipose and one liver tissue-specific model. Gene expression data were selected and preprocessed, and the panp method was applied to calculate the presence and absence calls. The genes with their corresponding presence and absence calls and the human Recon 1 model were used as inputs for the GIMME algorithm. The last step is the comparison of the resulting tissue-specific models with already published adipocyte and liver models.

4.2.1 Expression data and preprocessing in R

The expression datasets were taken from the GEO database after searching for adipose tissue and liver data. The data of the datasets GSE15773 (human adipocyte data) [14] and GSE15653 (human liver data) [90] were read into R as CEL files. Only one kind of sample was selected from each series: samples of insulin-sensitive omental tissue (GSE15773) and samples of normal liver tissue (GSE15653). GCRMA was applied as preprocessing method, because it is reported [68] to return the best results in connection with the panp package for the analysis of presence and absence calls.

4.2.2 Presence/Absence calls from Negative Probesets

The R command pa.calls from the panp package was applied, using a tight cutoff of 0.01 and a loose cutoff of 0.01 as inputs.
The same value was chosen for both cutoffs to obtain only presence and absence calls, because the construction of a tissue-specific model with presence, absence, and marginal calls is more involved. The result of the analysis is a presence or absence label for each gene in each sample. For further use, this output has to be reduced to one vector that contains a single presence or absence call for each gene over all samples. For this purpose a strict option was chosen, which means that a gene was only classified as present if it was labelled present in all samples.

Genes in the human Recon 1 model are specified using Entrez identifiers. The genes in the expression datasets were encoded with Affymetrix manufacturer identifiers and had to be converted to be comparable. Therefore, the annotation packages hgu133plus2.db and hgu133a.db, available in R, were used to rename the genes. The two resulting vectors, one containing the genes and one containing the presence and absence calls, were exported as CSV files.

4.2.3 Final model generation using the GIMME algorithm

The GIMME algorithm requires two inputs for creating a tissue-specific model:

- A genome-scale metabolic model
- A data structure containing the genes and the presence and absence calls

Human Recon 1 was used as genome-scale metabolic model. In order to create a precise tissue-specific model, the lower and upper bounds as well as the objective function have to be adapted appropriately. Using the described input data, the COBRA method createTissueSpecificModel reconstructs a tissue-specific model by applying the GIMME algorithm, as described in chapter 3.1.1. In this way models for adipocyte and liver were created.

The resulting liver model is shown in table 4.1 and is compared to already existing liver models [37][44] and to two complete human metabolic models [3][41].

              Liver model   Liver model (Jerby et al.)   Liver model (Gille et al.)   Recon1   EHMN
Metabolites   2564          1360                         1088                         2766     6522
Reactions     2984          1826                         2539                         3743     6216

Table 4.1: Comparison of the tissue-specific liver models and the human metabolic models used as starting points.

As can be seen, the previously published liver models differ in the number of reactions and metabolites, as they were reconstructed using different approaches. Jerby et al. [37] constructed their model by applying their model building algorithm (MBA) [37] with the Recon 1 model as basis. The model of Gille et al. [44] was developed manually using the Recon 1 and EHMN models as starting points. In each model the number of metabolites, the number of reactions, or both are reduced, as not all pathways are active in liver tissue. The newly created liver model still includes a large number of metabolites and reactions, which suggests that the model is less precisely adapted to the liver.

Table 4.2 displays the differences between the newly created adipocyte model, the existing adipocyte model of Bordbar et al. [48] and the human Recon 1 model.

              Adipocyte model   Adipocyte (Bordbar et al.)   Recon1
Metabolites   2549              554                          2766
Reactions     2830              649                          3743

Table 4.2: Comparison of the tissue-specific adipocyte models and the human metabolic model used as starting point.

The manually curated model of Bordbar et al. contains a considerably smaller number of reactions and metabolites than the newly constructed model.

4.3 Gene expression data analysis

Eight gene expression datasets were selected for mapping differentially expressed genes onto genome-scale metabolic models using the reporter features algorithm. Differential expression analysis was carried out using the limma package in R. The resulting top 10 differentially expressed genes are listed together with the pathways in which they are involved.
4.3.1 Obtaining expression data

The first step was querying the ArrayExpress database for gene expression datasets of adipose tissue. Several datasets were chosen for closer inspection and were downloaded for further use. Table 4.3 gives an overview of the selected datasets.

GSE11975 [14]
  Title: Gene expression in adipose tissue during weight loss
  Platform: Agilent-012391 Whole Human Genome Oligo Microarray G4112A (GPL1708)
  Design of the study: Regulation of adipose tissue gene expression during different phases of a dietary weight loss program and its relationship with insulin sensitivity:
    - 48 samples
    - Energy restriction phase (ER) with a 4-week very-low-calorie diet
    - Weight stabilization period (WS) composed of a 2-month low-calorie diet
    - 3 to 4 months of a weight maintenance (DI) diet
    - Two samples per dietary phase, one before and one after the specific phase

GSE15773 [15]
  Title: Expression data from human adipose tissue
  Platform: Affymetrix Human Genome U133 Plus 2.0 Array
  Design of the study: Determination of gene expression signatures of omental and subcutaneous tissue samples:
    - 19 samples
    - 5 insulin-resistant probands
    - 5 insulin-sensitive probands
    - Insulin-resistant probands and insulin-sensitive probands were paired by their body mass index
    - One sample of subcutaneous and omental adipose tissue of each proband

GSE20536 [91]
  Title: Genome-wide analysis of adipose tissue gene expression in twin-pairs discordant for physical activity for over 30 years
  Platform: Illumina HumanWG-6 v3.0 expression beadchip
  Design of the study: Biopsy samples of adipose tissue from twin pairs that had been followed for their discordance for physical activity for 32 years:
    - 12 samples
    - Two mono- and four dizygotic twins
    - Paired sample per twin pair

GSE23506 [92]
  Title: Differences in subcutaneous adipose tissue gene expression between obese African Americans and Hispanic Youths
  Platform: Illumina HumanHT-12 v3.0 expression beadchip
  Design of the study: Cross-sectional study design to compare subcutaneous adipose tissue gene expression profiles:
    - 36 samples
    - 17 African American
    - 19 Hispanics

GSE24432 [93]
  Title: Subcutaneous adipose tissue: comparison of weight maintenance and weight regain following an 8-week low calorie diet
  Platform: Agilent 014850 Whole Genome Microarray 4x44K G4112F
  Design of the study: Forty women followed a dietary protocol consisting of an 8-week low calorie diet (LCD) and a 6-month weight maintenance phase:
    - A total of 80 samples
    - 20 probands were classified as weight maintainers (WM)
    - 20 probands were classified as weight regainers (WR)
    - 2 paired samples per person, one before and one after LCD

GSE28005 [13]
  Title: Characterization of the initial molecular events of adipose tissue development and growth during overfeeding in humans
  Platform: Affymetrix Human Genome U133 Plus 2.0 Array
  Design of the study: Healthy lean and overweight subjects were submitted to a high fat diet during 56 days:
    - A total of 54 samples
    - 18 probands
    - 3 paired samples per proband, taken at day 0, day 14, day 56

GSE34007 [94]
  Title: Hypoxia-induced modulation of gene expression in human adipocytes
  Platform: Agilent 014850 Whole Genome Microarray 4x44K G4112F
  Design of the study: Human adipocytes (Zen-bio cells) were incubated in hypoxic conditions (1% O2) for 24 h. Control human adipocytes were incubated under normoxic conditions (21% O2):
    - 8 samples
    - 8 biological replicates for each experimental condition

GSE35411 [95]
  Title: Differential gene expression in adipose tissue from obese human subjects during weight loss and weight maintenance
  Platform: Affymetrix Human Genome U133 Plus 2.0 Array
  Design of the study: Low calorie diet (LCD) containing 1200 kcal/day for three months, followed by a six-month follow-up period after the weight reduction phase:
    - 26 samples
    - 3 paired samples per proband, taken at baseline, after weight reduction, after weight maintenance phase

Table 4.3: Description of the chosen datasets.

4.3.2 Calculation of differential expression

Each SOFT file of the selected datasets was loaded into R using the GEOquery package.
Subsequently, each dataset was converted to an ExpressionSet for further analysis with the limma package. Because of the differences between the selected microarray experiments, the design and analysis steps of each dataset are explained below, ordered by the setup of the datasets.

The dataset GSE23506 is a cross-sectional study of a two-colour microarray experiment. A reference pool is on one channel and the samples to compare are on the other channel. Hence, the dataset was analysed as a comparison of two groups against a common reference, for which a design matrix was created as explained in chapter 3.2.4.

Dataset GSE34007 is a one-channel microarray experiment, which requires a simple comparison of two groups. The design matrix is the same as for the dataset GSE23506. The target file was constructed as described in chapter 3.2.4.

Most of the datasets used contain paired samples with measurements at different points in time, so that the experimental setup corresponds to a simple or a more advanced paired t-test.

The dataset GSE20536 contains data of twin pairs from a one-colour microarray. This implies that an ordinary paired t-test, as described in chapter 3.2.4, is applied to the data.

The dataset GSE11975 contains paired samples from a two-colour microarray and can be treated like a one-colour microarray because of the reference pool. The following comparisons were applied: before vs. after energy restriction (ER), after energy restriction vs. after weight stabilization (WS), and before dietary intervention vs. after weight stabilization (DI). All three comparisons are independent of each other, but have to be analysed within one experimental setup, which leads to a more general version of a paired t-test. The design matrix is created by defining the pairs and afterwards adding columns for the specific comparisons. A contrast matrix is not needed for these paired samples, as the comparisons are already included in the design matrix.
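The construction of a paired design matrix described above can be sketched as follows. The Python sketch is illustrative only and is not the limma code used in the thesis: it assumes a hypothetical target frame like table 3.3 with a pair label and a treatment label per array, builds a design matrix with an intercept, one blocking column for each pair after the first and one treatment column, and fits it by ordinary least squares. The function names `paired_design` and `least_squares` and the expression values are made up for illustration. limma would additionally moderate the resulting statistics with eBayes, which is not reproduced here.

```python
def paired_design(pairs, treatments, baseline="C"):
    """Design matrix for paired samples.

    Rows are arrays. Columns are: intercept, one indicator per pair
    after the first pair, and one treatment indicator.
    """
    levels = sorted(set(pairs))
    rows = []
    for pair, treatment in zip(pairs, treatments):
        row = [1.0]
        row += [1.0 if pair == level else 0.0 for level in levels[1:]]
        row.append(0.0 if treatment == baseline else 1.0)
        rows.append(row)
    return rows


def least_squares(design, y):
    """Solve the normal equations (X^T X) b = X^T y by Gauss-Jordan elimination."""
    k = len(design[0])
    # Augmented matrix [X^T X | X^T y].
    a = [
        [sum(r[i] * r[j] for r in design) for j in range(k)]
        + [sum(r[i] * v for r, v in zip(design, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda idx: abs(a[idx][col]))
        a[col], a[pivot] = a[pivot], a[col]
        scale = a[col][col]
        a[col] = [v / scale for v in a[col]]
        for row in range(k):
            if row != col:
                factor = a[row][col]
                a[row] = [vr - factor * vc for vr, vc in zip(a[row], a[col])]
    return [a[i][k] for i in range(k)]


# Hypothetical log-expression values of one gene for the layout of table 3.3.
pairs = [1, 1, 2, 2, 3, 3]
treatments = ["C", "T", "C", "T", "C", "T"]
values = [5.0, 6.0, 7.0, 9.0, 4.0, 4.5]

design = paired_design(pairs, treatments)
coefficients = least_squares(design, values)
print(coefficients[-1])  # treatment effect, approximately 1.1667
```

In this balanced layout the treatment coefficient equals the mean of the within-pair differences (1.0, 2.0 and 0.5), which is why no separate contrast matrix is needed for the treatment comparison.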
The dataset GSE24432, a two-colour microarray experiment, includes four measurements against a reference pool: (i) weight maintainers, before low calorie diet vs. after low calorie diet, and (ii) weight regainers, before low calorie diet vs. after low calorie diet. Based on the study setup, the derived comparisons are between WM samples and WR samples. The experimental design in limma follows the same setup as for the dataset GSE11975.

Dataset GSE15773 is a one-colour microarray experiment and includes samples from insulin-resistant and insulin-sensitive probands, where one sample was obtained from omental and one from subcutaneous tissue of each proband. The comparisons describe the differences between insulin-resistant and insulin-sensitive subcutaneous tissue, and between insulin-resistant and insulin-sensitive omental tissue.

The dataset GSE28005, a one-colour microarray experiment, contains data from a time series in which samples were taken on day 0, day 14, and day 56. The comparisons are not predefined as in the datasets described previously. To obtain comparable results between gene expression patterns, the following comparisons were chosen: day 0 vs. day 14 and day 0 vs. day 56.

Dataset GSE35411 uses the same design as dataset GSE28005 and contains data from three points in time without predefined comparisons. Therefore, the comparisons baseline vs. after weight reduction and baseline vs. weight maintenance were selected.

After applying the lmFit and eBayes methods, each gene in the eBayes output has a manufacturer identifier. As human metabolic models use Entrez or RefSeq identifiers, the manufacturer identifiers of the genes cannot be used directly. Therefore, Entrez and RefSeq IDs were added for each gene by querying the Platform information of the SOFT file. However, for several genes the correct Entrez or RefSeq ID could not be assigned, as either no Entrez or RefSeq ID is available or multiple IDs match one gene name.
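The two problem cases just mentioned, missing identifiers and multiple identifiers per probe, can be illustrated with a small sketch. The Python code below is not the procedure used in the thesis; it shows one possible treatment in which probes without an identifier are dropped and probes with several identifiers produce one row per identifier. The function name `map_probes`, the probe names, the statistics and the separator " /// " between multiple identifiers in the annotation column are assumptions made for illustration.

```python
def map_probes(results, annotation, sep=" /// "):
    """Attach gene identifiers to differential expression results.

    results:    list of (probe_id, p_value, log_fold_change) tuples
    annotation: dict mapping a probe_id to a string of gene identifiers
    sep:        assumed separator between multiple identifiers
    """
    mapped = []
    for probe_id, p_value, log_fc in results:
        raw = annotation.get(probe_id, "")
        gene_ids = [part.strip() for part in raw.split(sep) if part.strip()]
        # Probes without any identifier contribute no row;
        # probes with several identifiers contribute one row each.
        for gene_id in gene_ids:
            mapped.append((gene_id, p_value, log_fc))
    return mapped


# Hypothetical probes and statistics, for illustration only.
results = [
    ("probe_a", 0.001, 1.2),
    ("probe_b", 0.010, -0.5),
    ("probe_c", 0.020, 0.3),
]
annotation = {"probe_a": "1622", "probe_b": "", "probe_c": "125 /// 126"}

for row in map_probes(results, annotation):
    print(row)
```

A design choice worth noting is that a probe with several identifiers passes its p-value and fold change on to every identifier, so that none of the candidate genes is lost when the list is later matched against a metabolic model.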
Finally, the topTable function was used to summarize the results of the linear model by creating a list of the differentially expressed genes. The following parameters were chosen:
- adjust.method: Benjamini-Hochberg
- sort.by: P-value
- number: depending on the number of genes

Benjamini-Hochberg was selected as adjusting method to control the false discovery rate. With this method, the genes whose adjusted p-value lies below a chosen threshold are selected as differentially expressed, and the expected proportion of false discoveries among the selected genes is kept below that threshold [67].

The created list of differentially expressed genes was modified to obtain one list containing Entrez identifiers with the corresponding p-values and (log) fold changes and one list in which RefSeq identifiers are used. In both lists, genes without an Entrez or RefSeq identifier were deleted, and genes with multiple identifiers were written line by line, one identifier per line.

Table 4.4 shows the top 10 differentially expressed genes, the pathways they are involved in, and those reporter metabolites (highlighted in yellow) of tables 4.5 and 4.7 which are also involved in these pathways.
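The adjustment of the p-values for the false discovery rate can be made concrete with a short sketch. The thesis relies on the limma package for this step; the Python function below is only an illustration of the standard Benjamini-Hochberg procedure and is not taken from limma. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is multiplied by m / rank (rank 1 = smallest p-value);
    a running minimum from the largest rank downwards keeps the adjusted
    values monotone.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw p-values of four genes.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
selected = [i for i, q in enumerate(adjusted) if q < 0.05]
print(adjusted, selected)
```

A gene is then called differentially expressed if its adjusted p-value lies below the chosen threshold.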
| Ranking | EntrezID | RefSeqID | GeneName | P-value | Pathway | Metabolites |
|---|---|---|---|---|---|---|
| 1 | 8365 | BC010926.1 | HIST1H4H - histone cluster 1, H4h | 2.38006191220476e-07 | hsa05034: Alcoholism | |
| | | | | | hsa05322: Systemic lupus erythematosus | |
| 2 | 55973 | NM_001008406.1 | BCAP29 - B-cell receptor-associated protein 29 | 6.2098504477583e-07 | | |
| 3 | 1622 | CR456956.1, M15887.1, NM_001079862.1, NM_001079863.1, NM_020548.5 | DBI - diazepam binding inhibitor (GABA receptor modulator, acyl-CoA binding protein) | 6.27199638434536e-07 | hsa03320: PPAR signaling pathway | |
| 4 | 55969 | AF274936.1, BC001871.1, BC004446.1, NM_018840.2, NM_199483.1 | C20orf24 - chromosome 20 open reading frame 24 | 6.67608370868795e-07 | | |
| 5 | 125 | NM_000668.3 | ADH1B - alcohol dehydrogenase 1B (class I), beta polypeptide | 1.113560790908e-06 | hsa00010: Glycolysis / Gluconeogenesis | C00111 C00236 |
| | | | | | hsa00071: Fatty acid metabolism | |
| | | | | | hsa00350: Tyrosine metabolism | C00122 C01036 C01179 |
| | | | | | hsa00830: Retinol metabolism | |
| | | | | | hsa00980: Metabolism of xenobiotics by cytochrome P450 | |
| | | | | | hsa00982: Drug metabolism - cytochrome P450 | |
| | | | | | hsa01100: Metabolic pathways | C00097 C00111 C00122 C00199 C00236 C00606 C01036 C01179 C03684 |
| 6 | 23086 | AY099469.1, NM_015065.1 | EXPH5 - exophilin 5 | 1.1786953108756e-06 | | |
| 7 | 55904 | AY147037.1, NM_018682.3, NM_182931.2 | MLL5 - myeloid/lymphoid or mixed-lineage leukemia 5 | 1.50324865193418e-06 | hsa00310: Lysine degradation | |
| 8 | 1806 | NM_000110.3 | DPYD - dihydropyrimidine dehydrogenase | 1.90562620638919e-06 | hsa00240: Pyrimidine metabolism | |
| | | | | | hsa00410: beta-Alanine metabolism | |
| | | | | | hsa00770: Pantothenate and CoA biosynthesis | C00097 |
| | | | | | hsa00983: Drug metabolism - other enzymes | |
| | | | | | hsa01100: Metabolic pathways | C00097 C00111 C00122 C00199 C00236 C00606 C01036 C01179 C03684 |
| 9 | 9669 | NM_015904.3 | EIF5B - eukaryotic translation initiation factor 5B | 1.94232219398723e-06 | hsa03013: RNA transport | |
| 10 | 1290 | BC043613.1, BC086874.1, NM_000393.3 | COL5A2 - collagen, type V, alpha 2 | 2.29350303664699e-06 | hsa04510: Focal adhesion | |
| | | | | | hsa04512: ECM-receptor interaction | |
| | | | | | hsa04974: Protein digestion and absorption | C00097 |
| | | | | | hsa05146: Amoebiasis | |

Table 4.4:
The top 10 differentially expressed genes from the GSE35411 dataset (comparison 'baseline vs. after weight reduction') with the corresponding pathways and metabolites.

4.4 Reporter metabolites analysis

As mentioned in chapters 3.1.1 and 3.1.4, reporter metabolites are detected by mapping gene expression data onto genome-scale metabolic models. For this purpose, different kinds of gene expression data and three different genome-scale metabolic models were used:
- Human Recon 1 model
- Human EHMN model
- Adipocyte model, a tissue-specific model of Bordbar et al. [48]

The results were used to compare the reporter metabolites in the following ways: (i) comparison of the reporter metabolites between all 42 outputs; (ii) comparison of the reporter metabolites between all outputs using one genome-scale metabolic model; (iii) comparison of different models using the same expression data; (iv) comparison of two kinds of expression data of one dataset (treatment vs. control) using only one model.

4.4.1 Reporter Features Analysis

The reporter features toolbox is a standalone toolbox for mapping gene expression data onto genome-scale metabolic models to identify reporter metabolites. The toolbox requires three inputs as tab-separated text files:
- Analysed gene expression data
- Gene-reaction interaction of the metabolic model
- Reaction-metabolite interaction of the metabolic model

The gene expression data are the differentially expressed genes of the eight datasets. The gene-reaction interaction file describes the relationship between the genes and the reactions within a genome-scale metabolic model and needs to be created for all models, whereby the first column has to contain the reactions and the second column the genes. As the human Recon 1 and adipocyte genome-scale metabolic models include reactions as well as genes, the COBRA method findGenesFromReactions was used to create the text file.
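The layout of the gene-reaction interaction file can be sketched as follows. This Python snippet is only an illustration of the two-column, tab-separated layout described above (reaction first, gene second); it does not reproduce the output of the COBRA method. Whether the toolbox expects a header line is not stated here, so none is written. The function name, the reaction IDs and the gene IDs are hypothetical.

```python
def gene_reaction_lines(reaction_to_genes):
    """Return tab-separated 'reaction<TAB>gene' lines, one per pair."""
    lines = []
    for reaction in sorted(reaction_to_genes):
        # A gene listed twice for the same reaction is written only once.
        for gene in sorted(set(reaction_to_genes[reaction])):
            lines.append(f"{reaction}\t{gene}")
    return lines


# Hypothetical reactions and gene IDs, for illustration only.
reaction_to_genes = {
    "R_example2": ["222"],
    "R_example1": ["111", "333", "111"],
}
lines = gene_reaction_lines(reaction_to_genes)
print("\n".join(lines))
```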
The EHMN model includes no information about gene IDs. They were consequently taken from the supplementary Excel file, which includes information about the interactions of the EHMN model. The reaction-metabolite interaction input file describes the metabolite-reaction interactions and is based on the Simple Interaction File (SIF) format [96]. This file can easily be created by loading the SBML file of the model into Cytoscape and exporting it as a SIF file. As the toolbox cannot read SIF files directly, the columns have to be switched (the first column has to contain the metabolites and the second the reactions) to obtain a valid input for the reporter features toolbox.

The reporter metabolites were calculated for the differentially expressed genes of each comparison of a dataset in combination with each genome-scale metabolic model. For all calculations the default parameters (kmax: 100, imax: 10000, reporter degree: 1, p-value cutoff: 0.05) were used. The reporter features toolbox returns three output files for each calculation: (i) one main output file containing the ranking of the metabolites, (ii) one neighbour file containing all nodes, and (iii) one neighbour file providing the metabolites that have some data value associated with them in the main output file. The first part of the main output file includes up- and down-regulated metabolites, the second part only up-regulated, and the third part only down-regulated metabolites. For the following analysis of the metabolites, these parts were split into three separate files.

4.4.2 Adapting human Recon 1

Comparing the results of the reporter features toolbox between different models requires a unified nomenclature of metabolites. As metabolite names differ in spelling, unique identifiers have to be used. The only model containing unique identifiers is the EHMN model, where each metabolite is assigned a KEGG COMPOUND ID, a KEGG GLYCAN ID, or an internal ID.
The human Recon 1 model does not initially include KEGG IDs of the metabolites in the SBML file, but IDs are available from the BiGG database. Consequently, KEGG COMPOUND and GLYCAN IDs from the BiGG database were validated and added to the Recon 1 model. However, many metabolites remained without an ID or with a wrong ID, which required a manual search by name or synonym to identify as many metabolites as possible. For metabolites that were not included in any of the databases (KEGG COMPOUND, KEGG GLYCAN, HMDB, or ChEBI), the internal ID of the EHMN model was added. Nevertheless, there are still metabolites without an ID:
- IDs taken from the BiGG database: 1901
- IDs from the BiGG database, but edited: 35
- Newly added IDs: 218
- Number of metabolites without ID: 612

As the adipocyte model is derived from the human Recon 1 model, it includes the same metabolites, except for some specific metabolites describing the adipose tissue. These specific metabolites could not be found in any of the databases KEGG COMPOUND, KEGG GLYCAN, HMDB, and ChEBI.

Both models were also exported from MATLAB as SBML files including the KEGG COMPOUND and GLYCAN IDs. The KEGG COMPOUND and GLYCAN IDs and the complete metabolite names were added to all output files of the reporter features toolbox which contain a ranking of metabolites.

4.4.3 Comparison of the reporter metabolites

The reporter metabolites analysis, applied to eight datasets in combination with the EHMN, Recon 1, and adipocyte genome-scale metabolic models, returns 42 output files, each containing a list of reporter metabolites. Hence, there are fourteen output files per model and the datasets contain one, two or three points of measurement.
Therefore, several comparisons can be applied to the output files of the reporter features algorithm:
- Comparison of the rank of the metabolites between all 42 outputs
- Comparison of the rank of the metabolites between all outputs of one genome-scale metabolic model
- Comparison of the top 10 ranked metabolites between the three outputs for each model using the same differential expression data
- Comparison of the top 10 ranked metabolites within one dataset that contains two points of measurement, using the same genome-scale metabolic model

All output files were loaded into a SQL database to apply the comparisons between the reporter metabolites. The ranking is based on the p-values of the metabolites, and the metabolites are matched by their KEGG ID or, if no KEGG ID is available, by their name. The comparisons between the models and the expression data of one dataset lead to a large number of tables. Therefore, one comparison is illustrated in the results and the other comparisons can be found in the appendix.

Overlapping metabolites over all datasets

There are 186 reporter metabolites, compared by their KEGG ID, that are present in all 42 output files of the reporter metabolites analysis. The ranking of these reporter metabolites differs considerably between the output files.

Overlapping metabolites in each model

The number of overlapping metabolites over all output files of one genome-scale metabolic model varies, because each output file includes different reporter metabolites. In general, more overlapping metabolites were found than over all 42 output files, because a metabolite only has to be present in the outputs of one model and not in those of all three models.
- EHMN model: 1985 overlapping reporter metabolites
- Recon 1 model: 1252 overlapping reporter metabolites
- Adipocyte model: 298 overlapping reporter metabolites

However, the ranking of the overlapping reporter metabolites remains diverse for all used models.
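The overlap computation described above was carried out in a SQL database; the following Python sketch only illustrates the underlying logic and is not the query used in the thesis. It assumes that each output file has been read into a list of (KEGG ID, name, p-value) rows, that an empty KEGG ID means "not available", and that names are compared after trimming and lower-casing (the normalisation is an assumption). All p-values and the name-only metabolite in the example are toy values.

```python
def match_key(kegg_id, name):
    """Prefer the KEGG ID; fall back to a normalised metabolite name."""
    return kegg_id if kegg_id else name.strip().lower()


def ranked_keys(rows):
    """Map each metabolite key to its rank (1 = smallest p-value)."""
    ordered = sorted(rows, key=lambda row: row[2])
    return {match_key(kegg, name): rank
            for rank, (kegg, name, _p) in enumerate(ordered, start=1)}


def overlapping_metabolites(outputs):
    """Return the keys present in every output with their rank per output."""
    rankings = [ranked_keys(rows) for rows in outputs]
    if not rankings:
        return {}
    shared = set(rankings[0]).intersection(*rankings[1:])
    return {key: [ranking[key] for ranking in rankings] for key in sorted(shared)}


# Toy outputs of two models; the p-values are illustrative only.
output_a = [("C01036", "4-Maleylacetoacetate", 0.005),
            ("C00536", "Inorganic triphosphate", 0.009),
            ("", "Hypothetical lipid X", 0.020)]
output_b = [("C00536", "Triphosphate", 0.006),
            ("C01036", "4-Maleylacetoacetate", 0.003),
            ("", "hypothetical lipid x ", 0.500),
            ("C00111", "Glycerone phosphate", 0.025)]

overlap = overlapping_metabolites([output_a, output_b])
for key, ranks in overlap.items():
    print(key, ranks)
```

The example also shows why a metabolite can be shared between two outputs while its rank differs: the rank depends on all other metabolites contained in the respective output.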
Differential gene expression in adipose tissue from obese human subjects during weight loss and weight maintenance (GSE35411) - Adipocyte model

Adipose tissue data were used as gene expression data. Therefore, the results of the adipocyte model are suitable as example files to present the results in this chapter. The first table (table 4.5) lists the top 10 reporter metabolites of the adipocyte model together with the rank of these metabolites when the EHMN or the Recon 1 model is used with the same expression data.

| KEGG ID | Metabolite name | Adipocyte model: Ranking | Adipocyte model: P-value | EHMN model: Ranking | EHMN model: P-value | Recon 1 model: Ranking | Recon 1 model: P-value | Number of pathways |
|---|---|---|---|---|---|---|---|---|
| C01036 | 4-Maleylacetoacetate | 1 | 0.005185 | 19 | 0.00297981 | 4 | 0.0023486 | 2 |
| C00536 | Inorganic triphosphate | 2 | 0.00926387 | 29 | 0.00607262 | 7 | 0.00493835 | 1 |
| C00111 | Dihydroxyacetone phosphate | 3 | 0.0115888 | 91 | 0.0249449 | 51 | 0.0317427 | 10 |
| C00606 | 3-Sulfino-L-alanine | 4 | 0.0138944 | 52 | 0.0125323 | 12 | 0.0089768 | 3 |
| C00097 | L-Cysteine | 5 | 0.0164945 | 109 | 0.0340762 | 102 | 0.0643997 | 11 |
| C01179 | 3-(4-Hydroxyphenyl)pyruvate | 6 | 0.0220697 | 1996 | 0.949555 | 640 | 0.539659 | 4 |
| C00122 | Fumarate | 7 | 0.0227082 | 476 | 0.214497 | 87 | 0.053884 | 11 |
| C00236 | 3-Phospho-D-glyceroyl phosphate | 8 | 0.0243508 | 404 | 0.175407 | 323 | 0.242925 | 2 |
| C00199 | D-Ribulose 5-phosphate | 9 | 0.0249655 | 64 | 0.0161579 | 19 | 0.0141511 | 5 |
| C03684 | 6-Pyruvoyl-5,6,7,8-tetrahydropterin | 10 | 0.0267392 | 29 | 0.00607262 | 23 | 0.0159692 | 2 |

Table 4.5: Top 10 metabolites of the GSE35411 dataset (comparison 'baseline vs. after weight reduction') using the adipocyte model in comparison to the EHMN and Recon 1 model.

The ranking of the reporter metabolites differs between the outputs of the models based on the same expression data. Moreover, some of the top 10 reporter metabolites of one model cannot be found at all in the rankings created with the other models and the same expression data. Another kind of comparison is the detection of differences between the reporter metabolites using two measurements within one dataset and the same model.
This kind of comparison, displayed in table 4.6, illustrates the divergence between the two points of measurement, whereas the comparison between the models aims at similar rankings of the reporter metabolites.

| KEGG ID | Metabolite name | base-loss: Ranking | base-loss: P-value | base-main: Ranking | base-main: P-value |
|---|---|---|---|---|---|
| C01036 | 4-Maleylacetoacetate | 1 | 0.005185 | 189 | 0.620879 |
| C00536 | Inorganic triphosphate | 2 | 0.00926387 | 15 | 0.0329617 |
| C00111 | Dihydroxyacetone phosphate | 3 | 0.0115888 | 144 | 0.465898 |
| C00606 | 3-Sulfino-L-alanine | 4 | 0.0138944 | 234 | 0.752851 |
| C00097 | L-Cysteine | 5 | 0.0164945 | 116 | 0.364113 |
| C01179 | 3-(4-Hydroxyphenyl)pyruvate | 6 | 0.0220697 | 198 | 0.654706 |
| C00122 | Fumarate | 7 | 0.0227082 | 108 | 0.331964 |
| C00236 | 3-Phospho-D-glyceroyl phosphate | 8 | 0.0243508 | 18 | 0.0445467 |
| C00199 | D-Ribulose 5-phosphate | 9 | 0.0249655 | 209 | 0.682199 |
| C03684 | 6-Pyruvoyl-5,6,7,8-tetrahydropterin | 10 | 0.0267392 | 59 | 0.165882 |

Table 4.6: GSE35411 dataset using the adipocyte model, illustrating the comparison of the top 10 metabolites between baseline vs. after weight reduction (base-loss) and baseline vs. weight maintenance (base-main).

Table 4.7 shows the pathways in which the top 10 reporter metabolites are involved. The top 10 reporter metabolites are involved in 35 pathways and the top 10 genes in 22 pathways. There are five pathways (coloured yellow in tables 4.4 and 4.7) in which top 10 ranked reporter metabolites as well as top 10 ranked genes are involved.
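The pathway comparison above amounts to intersecting two sets of KEGG pathway IDs. The following Python sketch illustrates this with the pathway IDs as far as they can be read from tables 4.4 and 4.7; the gene list contains only the IDs recoverable from table 4.4, so its size may differ from the number stated in the text. The function name is hypothetical and the snippet is not the procedure used in the thesis.

```python
def shared_pathways(gene_pathways, metabolite_pathways):
    """Return the sorted pathway IDs that occur in both sets."""
    return sorted(set(gene_pathways) & set(metabolite_pathways))


# Pathways of the top 10 genes (as recoverable from table 4.4).
gene_pathways = {
    "hsa05034", "hsa05322", "hsa03320", "hsa00010", "hsa00071",
    "hsa00350", "hsa00830", "hsa00980", "hsa00982", "hsa01100",
    "hsa00310", "hsa00240", "hsa00410", "hsa00770", "hsa00983",
    "hsa03013", "hsa04510", "hsa04512", "hsa04974", "hsa05146",
}

# Pathways of the top 10 reporter metabolites (table 4.7).
metabolite_pathways = {
    "hsa01100", "hsa00010", "hsa00020", "hsa00030", "hsa00040",
    "hsa00051", "hsa00052", "hsa00130", "hsa00190", "hsa00250",
    "hsa00260", "hsa00270", "hsa00330", "hsa00350", "hsa00360",
    "hsa00400", "hsa00430", "hsa00480", "hsa00561", "hsa00562",
    "hsa00564", "hsa00620", "hsa00650", "hsa00730", "hsa00740",
    "hsa00750", "hsa00760", "hsa00770", "hsa00790", "hsa00920",
    "hsa00970", "hsa04122", "hsa04974", "hsa05200", "hsa05211",
}

common = shared_pathways(gene_pathways, metabolite_pathways)
print(len(common), common)
```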
| Pathway | Pathway name | Reporter metabolites |
|---|---|---|
| hsa01100 | Metabolic pathways | cpd:C00097 L-Cysteine; cpd:C00111 Glycerone phosphate; cpd:C00122 Fumarate; cpd:C00199 D-Ribulose 5-phosphate; cpd:C00236 3-Phospho-D-glyceroyl phosphate; cpd:C00606 3-Sulfino-L-alanine; cpd:C01036 4-Maleylacetoacetate; cpd:C01179 3-(4-Hydroxyphenyl)pyruvate; cpd:C03684 6-Pyruvoyltetrahydropterin |
| hsa00010 | Glycolysis / Gluconeogenesis | cpd:C00111 Glycerone phosphate; cpd:C00236 3-Phospho-D-glyceroyl phosphate |
| hsa00020 | Citrate cycle (TCA cycle) | cpd:C00122 Fumarate |
| hsa00030 | Pentose phosphate pathway | cpd:C00199 D-Ribulose 5-phosphate |
| hsa00040 | Pentose and glucuronate interconversions | cpd:C00111 Glycerone phosphate; cpd:C00199 D-Ribulose 5-phosphate |
| hsa00051 | Fructose and mannose metabolism | cpd:C00111 Glycerone phosphate |
| hsa00052 | Galactose metabolism | cpd:C00111 Glycerone phosphate |
| hsa00130 | Ubiquinone and other terpenoid-quinone biosynthesis | cpd:C01179 3-(4-Hydroxyphenyl)pyruvate |
| hsa00190 | Oxidative phosphorylation | cpd:C00122 Fumarate; cpd:C00536 Triphosphate |
| hsa00250 | Alanine, aspartate and glutamate metabolism | cpd:C00122 Fumarate |
| hsa00260 | Glycine, serine and threonine metabolism | cpd:C00097 L-Cysteine |
| hsa00270 | Cysteine and methionine metabolism | cpd:C00097 L-Cysteine; cpd:C00606 3-Sulfino-L-alanine |
| hsa00330 | Arginine and proline metabolism | cpd:C00122 Fumarate |
| hsa00350 | Tyrosine metabolism | cpd:C00122 Fumarate; cpd:C01036 4-Maleylacetoacetate; cpd:C01179 3-(4-Hydroxyphenyl)pyruvate |
| hsa00360 | Phenylalanine metabolism | cpd:C00122 Fumarate |
| hsa00400 | Phenylalanine, tyrosine and tryptophan biosynthesis | cpd:C01179 3-(4-Hydroxyphenyl)pyruvate |
| hsa00430 | Taurine and hypotaurine metabolism | cpd:C00097 L-Cysteine; cpd:C00606 3-Sulfino-L-alanine |
| hsa00480 | Glutathione metabolism | cpd:C00097 L-Cysteine |
| hsa00561 | Glycerolipid metabolism | cpd:C00111 Glycerone phosphate |
| hsa00562 | Inositol phosphate metabolism | cpd:C00111 Glycerone phosphate |
| hsa00564 | Glycerophospholipid metabolism | cpd:C00111 Glycerone phosphate |
| hsa00620 | Pyruvate metabolism | cpd:C00111 Glycerone phosphate |
| hsa00650 | Butanoate metabolism | cpd:C00122 Fumarate |
| hsa00730 | Thiamine metabolism | cpd:C00097 L-Cysteine |
| hsa00740 | Riboflavin metabolism | cpd:C00199 D-Ribulose 5-phosphate |
| hsa00750 | Vitamin B6 metabolism | cpd:C00199 D-Ribulose 5-phosphate |
| hsa00760 | Nicotinate and nicotinamide metabolism | cpd:C00111 Glycerone phosphate; cpd:C00122 Fumarate |
| hsa00770 | Pantothenate and CoA biosynthesis | cpd:C00097 L-Cysteine |
| hsa00790 | Folate biosynthesis | cpd:C03684 6-Pyruvoyltetrahydropterin |
| hsa00920 | Sulfur metabolism | cpd:C00097 L-Cysteine |
| hsa00970 | Aminoacyl-tRNA biosynthesis | cpd:C00097 L-Cysteine |
| hsa04122 | Sulfur relay system | cpd:C00097 L-Cysteine |
| hsa04974 | Protein digestion and absorption | cpd:C00097 L-Cysteine |
| hsa05200 | Pathways in cancer | cpd:C00122 Fumarate |
| hsa05211 | Renal cell carcinoma | cpd:C00122 Fumarate |

Table 4.7: The pathways in which the top 10 reporter metabolites are involved.

Chapter 5 Discussion

The objective of the thesis was the mapping of gene expression data onto genome-scale metabolic models. For this purpose, two tissue-specific models, one liver and one adipocyte model, were created and the detection of reporter metabolites was carried out. First of all, an adequate toolbox for handling genome-scale metabolic models in SBML file format had to be chosen. The first step was the creation of one liver and one adipocyte tissue-specific model. These newly created models were also compared with already published genome-scale metabolic models of the liver and the adipocyte. The second step dealt with the collection and preprocessing of gene expression data from obese tissues. The differentially expressed genes were used for further analysis with the reporter features algorithm, and the top 10 differentially expressed genes were shown in a table with their corresponding pathways.
The next step was the application of the reporter features algorithm to calculate the reporter metabolites, using the differentially expressed genes of each dataset in combination with each of the three genome-scale metabolic models (adipocyte, EHMN, and Recon 1). To enable a good comparison, the KEGG COMPOUND and GLYCAN IDs were added manually to the Recon 1 and adipocyte models, since the EHMN model already includes them. As a final step, the results of the reporter features algorithm were compared as follows: (i) between all 42 output files; (ii) between all output files using one genome-scale metabolic model; (iii) between different models using the same expression data; (iv) between two kinds of expression data of one dataset (treatment vs. control) using only one model. Moreover, the pathways of the top 10 ranked genes were compared with those of the top 10 ranked reporter metabolites to detect accordances.

Most existing toolboxes for handling genome-scale metabolic models implement only a limited number of functionalities. Therefore, it was necessary to use different toolboxes to obtain the desired result. This implies that data had to be converted into several distinct file formats, which makes mistakes more likely. Furthermore, the majority of these conversions had to be carried out manually by implementing own methods or adapting existing algorithms. The COBRA toolbox was selected for this thesis because it offered a wide variety of implemented functionalities. During the work, some COBRA methods proved to be well adapted to the Recon 1 model, but not to the other models used in the thesis. Moreover, the implementation of some basic functionalities turned out to be partial, covering only specific parts of a standard used by the toolbox. For example, the import function was only able to read the SBML files of the Recon 1 model and the derived tissue-specific models, but not those of other models such as the EHMN model (although all models were valid SBML files).
The design as a MATLAB toolbox made the adaptation that was necessary to process all SBML files with the COBRA toolbox feasible.

The creation of a tissue-specific model was a challenging task, because many different parameters had to be chosen adequately and no guideline was available. The first step was the preprocessing of the chosen raw data (from CEL files) in R. The GCRMA approach was used, because it is reported to return the best results in combination with the selected panp method [68] for calculating presence and absence calls. For the panp method the cutoff values had to be chosen, and it had to be decided whether the input data for the GIMME algorithm should consist of present and absent calls only, or of present, absent, and marginal calls. After calculating the calls for each gene in each sample file, a matrix was returned. However, the GIMME algorithm required a vector as input. Therefore, it had to be decided which conditions have to be fulfilled for a gene to be considered present, absent, or marginal. For instance, a gene could be considered present only if it is labelled as present in all samples, or if it is labelled as present in a given percentage of the samples. It should be noted that the more samples are used, the less likely it is that a gene is present in all of them.

Another difficulty was the genome-scale metabolic model. To obtain a well adapted tissue-specific model, the lower and upper bounds as well as the objective function had to be chosen correctly, because these parameters influence the final model. The resulting tissue-specific model is often modified to adapt it more precisely to the target tissue. Both newly reconstructed tissue-specific models, the adipocyte and the liver model, seemed to lack accuracy, because the already existing models contain fewer metabolites and reactions.
Besides the influence of the input data, the inaccuracy arose because the upper and lower bounds and the objective function need to be defined precisely. This step requires extensive knowledge about the genome-scale metabolic model that is used as basis and about the target tissue-specific model. For this reason, there is still room for further improvements.

Eight adipose tissue datasets were selected and the already preprocessed files (SOFT files) were used for the differential gene expression analysis. This step was carried out in R using the limma package. For this purpose, a design and a contrast matrix had to be constructed for each dataset. The descriptions of the different linear models for the analysis of differentially expressed genes are available in [67][70] and cover a large number of different microarray designs. The example files had more basic constructions of design and contrast matrix, whereas the used datasets included paired samples with different measurements in time. Therefore, the construction of the matrices was more advanced, but it was explained comprehensibly in the documentation.

The online version of the reporter toolbox did not work for large network files. Hence, a standalone version of the reporter features tool was requested. Using the standalone version, each set of gene expression data was combined with each genome-scale metabolic model (adipocyte, EHMN, Recon 1) to calculate the corresponding reporter metabolites.

The KEGG COMPOUND and GLYCAN IDs were added to the Recon 1 and adipocyte models by a manual search of the metabolites in different online databases. The main difficulty was that the spelling varies for each metabolite and that each metabolite name has many synonyms. In most cases, not all the information about a metabolite could be found using only one database. Hence, three databases were used: KEGG, ChEBI, and HMDB.
The completeness of the databases varies, meaning that a metabolite is not included in all databases and that the information about a metabolite differs between the databases. Most metabolites could be found in the HMDB or ChEBI database, because they contain the most synonyms for one metabolite. The corresponding KEGG ID could be detected via a provided hyperlink or by trying synonyms as input for the search in the KEGG database. The search for glycans was even more difficult, because glycans are mainly contained in the ChEBI and KEGG GLYCAN databases. Furthermore, their spelling was completely different between the Recon 1 model and the KEGG GLYCAN database. All in all, this was a very time-consuming step, which could have been repeated several times to match as many metabolites as possible to a KEGG ID. An aggravating circumstance was the heterogeneity and inconsistency between the three databases, although they are updated regularly and contain more and more metabolites.

After adding the KEGG IDs to the ranked metabolites of the output files of the reporter features toolbox, a comparison between the models was carried out using the KEGG ID and, if no KEGG ID was available, the metabolite name. The first conclusion was that the number of overlapping reporter metabolites is much higher when only the fourteen output files of one model are compared than when all 42 output files of all models are compared.

Differences in the ranking of the reporter metabolites occurred in all comparisons. One comparison aimed at detecting the differences between the reporter metabolites using the top 10 ranked metabolites within one dataset that contains two points of measurement. The remaining comparisons aimed at finding the similarities between the rankings of the reporter metabolites. One reason for the differences in the ranking was that each output file of the reporter features toolbox contained an individual list of reporter metabolites.
This list was influenced by the expression data and the genome-scale metabolic models. The three genome-scale metabolic models have different numbers of metabolites, reactions and genes. Therefore, the resulting ranked lists also contained different numbers of reporter metabolites. A further reason is the incompleteness of the IDs of the Recon 1 and adipocyte models as well as the internal IDs of the EHMN model. This caused problems in matching the metabolites between the models and led to unidentified matches, which influence the accuracy and completeness of the results.

Despite the differences between the used models and the incomplete IDs, similar rankings of reporter metabolites could be observed in the comparison between the three models with the same underlying differentially expressed genes. These comparisons were done for all 42 output files of the reporter features toolbox and are illustrated in the appendix. Moreover, identical pathways could be detected by comparing the pathways of the top 10 ranked reporter metabolites with those of the top 10 ranked differentially expressed genes.

To conclude, genome-scale metabolic models include a lot of biological information on the interconnections of reactions, metabolites and genes. Therefore, they can be used as a powerful tool to study the human metabolism as well as metabolic diseases or the influence of drugs. Increasing attention is paid to the construction of tissue-specific models to obtain more precise metabolic models of the key human tissues and cells [97].

List of Figures

2.1 Illustration of the different approaches for visualizing and analysing metabolic networks taken from [22].
2.2 In this figure a bipartite graph is represented. The circles represent the metabolite vertices and the rectangles the reaction vertices. The figure is redrawn from [24].
2.3 Example of a minimal model of glycolysis to illustrate the kinetic approach [22]. A is the reaction scheme and shows a graphical presentation of a minimal model of glycolysis. It shows that one unit of glucose (G) is converted by reactions into two units of pyruvate (P). B shows the stoichiometric matrix N, which includes the information of the metabolites in its rows and the information about the reactions in its columns. Gx, Px, and Glx represent external metabolites, which are not in the stoichiometric matrix. C represents the reaction list of the model and D the dynamic mass-balance equation or system of differential equations [21][22].
2.4 The figure depicts two similar reaction networks and the corresponding elementary flux modes. It is taken from [2].
2.5 The figure illustrates the five steps of flux balance analysis [25].
2.6 The four major applications of global human metabolic models [42].
2.7 The reconstruction of a metabolic reaction network is done with data from literature and gene-protein-reaction (GPR) relationships from experimental data. This information is converted into a stoichiometric matrix for the following simulation step, which is an iterative process. Thereby flux balance analysis is used to calculate the steady-state fluxes through the network using constraints. The results are analysed and validated with the help of different methods. Subsequently these outcomes are data of, for example, published literature or online databases, which might be used as new input for metabolic network reconstructions [50].
3.1 Overview of the COBRA toolbox functionalities [5].
3.2 Illustration of the step-wise procedure of the reporter metabolites algorithm [52].
3.3 The figure illustrates the conversion of a COBRA model into a TIGER model [54].
3.4 Illustration of the step-wise procedure of the Reporter Features algorithm [52].
3.5 The expression density for classifying genes as present, absent or marginal [68].

List of Tables

2.1 Comparison of the human metabolic models.
3.1 Target file of the comparison of two groups against a common reference. The table is adapted from [67].
3.2 Target file of the comparison of two groups. The table is adapted from [67].
3.3 Target file of the comparison of paired samples. The table is adapted from [67].
3.4 Information about the KEGG databases. The table is adapted from [87].
4.1 Comparison of the tissue-specific liver models and the human metabolic models used as starting point.
4.2 Comparison of the tissue-specific adipocyte models and the human metabolic models used as starting point.
4.3 Description of the chosen datasets.
4.4 The top 10 differentially expressed genes from the GSE35411 dataset (comparison 'baseline vs. after weight reduction') with the corresponding pathways and metabolites.
4.5 Top 10 metabolites of the GSE35411 dataset (comparison 'baseline vs. after weight reduction') using the adipocyte model in comparison to the EHMN and Recon 1 model.
4.6 GSE35411 dataset using the adipocyte model illustrating the comparison of the top 10 metabolites between baseline vs. after weight reduction and baseline vs. weight maintenance.
4.7 The pathways in which the top 10 reporter metabolites are involved.

Bibliography

[1] Nielsen J. Transcriptional control of metabolic fluxes. Mol Syst Biol. 2011 Mar;7:478.
[2] Klipp E, Herwig R, Kowald A, Wierling C, Lehrach H. Systems Biology in Practice. Concepts, Implementation and Application. Wiley-VCH; 2005.
[3] Duarte NC, Becker SA, Jamshidi N, Thiele I, Mo ML, Vo TD, et al. Global reconstruction of the human metabolic network based on genomic and bibliomic data. Proc Natl Acad Sci USA. 2007 Feb;104(6):1777–1782.
[4] Edelman LB, Eddy JA, Price ND. In silico models of cancer. Wiley Interdiscip Rev Syst Biol Med. 2010;2(4):438–459.
[5] Schellenberger J, Que R, Fleming RMT, Thiele I, Orth JD, Feist AM, et al. Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox v2.0. Nat Protoc. 2011 Sep;6(9):1290–1307.
[6] Palsson B, Zengler K. The challenges of integrating multi-omic data sets. Nat Chem Biol. 2010 Nov;6(11):787–789.
[7] Förster J, Famili I, Fu P, Palsson B, Nielsen J. Genome-scale reconstruction of the Saccharomyces cerevisiae metabolic network. Genome Res. 2003 Feb;13(2):244–253.
[8] Ma H, Sorokin A, Mazein A, Selkov A, Selkov E, Demin O, et al. The Edinburgh human metabolic network reconstruction and its functional analysis. Mol Syst Biol. 2007;3:135.
[9] Li L, Zhou X, Ching WK, Wang P. Predicting enzyme targets for cancer drugs by profiling human metabolic reactions in NCI-60 cell lines. BMC Bioinformatics. 2010;11:501.
[10] Folger O, Jerby L, Frezza C, Gottlieb E, Ruppin E, Shlomi T. Predicting selective drug targets in cancer through metabolic networks. Mol Syst Biol. 2011;7:501.
[11] Obesity and overweight. World Health Organization; 2012.
Available from: http://www.who.int/mediacentre/factsheets/fs311/en/index.html [cited 2012 Jul 25].
[12] Obesity: preventing and managing the global epidemic. Report of a WHO consultation. World Health Organ Tech Rep Ser. 2000;894:i–xii, 1–253.
[13] Alligier M, Meugnier E, Debard C, Lambert-Porcheron S, Chanseaume E, Sothier M, et al. Subcutaneous adipose tissue remodeling during the initial phase of weight gain induced by overfeeding in humans. J Clin Endocrinol Metab. 2012 Feb;97(2):E183–E192.
[14] Capel F, Klimcáková E, Viguerie N, Roussel B, Vítková M, Kováciková M, et al. Macrophages and adipocytes in human obesity: adipose tissue gene expression and insulin sensitivity during calorie restriction and weight stabilization. Diabetes. 2009 Jul;58(7):1558–1567.
[15] Hardy OT, Perugini RA, Nicoloro SM, Gallagher-Dorval K, Puri V, Straubhaar J, et al. Body mass index-independent inflammation in omental adipose tissue associated with insulin resistance in morbid obesity. Surg Obes Relat Dis. 2011;7(1):60–67.
[16] Schilling CH, Letscher D, Palsson BO. Theory for the systemic definition of metabolic pathways and their use in interpreting metabolic function from a pathway-oriented perspective. J Theor Biol. 2000 Apr;203(3):229–248.
[17] Schilling CH, Schuster S, Palsson BO, Heinrich R. Metabolic pathway analysis: basic concepts and scientific applications in the post-genomic era. Biotechnol Prog. 1999;15(3):296–303.
[18] Garrett R, Grisham C. Biochemistry. Physical Science David Harris; 2005.
[19] Seager S, Slabaugh M. Organic and Biochemistry for Today. 7th ed. Hartford C.; 2011.
[20] Deisboeck T, Kresh J, editors. Complex Systems Science in Biomedicine. Topics in Biomedical Engineering; 2006. Springer Verlag.
[21] Steuer R. Computational approaches to the topology, stability and dynamics of metabolic networks. Phytochemistry. 2007;68(16-18):2139–2151.
[22] Steuer R, Junker BH.
Computational Models of Metabolism: Stability and Regulation in Metabolic Networks. In: Rice SA, editor. Advances in Chemical Physics. vol. 142. Hoboken, NJ, USA: John Wiley & Sons, Inc.; 2008.
[23] Emmert-Streib F, Dehmer M. Networks for systems biology: conceptual connection of data and function. IET Syst Biol. 2011 May;5(3):185–207.
[24] Dehmer M, Emmert-Streib F, Graber A, Salvador A, editors. Applied Statistics for Network Biology: Methods in Systems Biology. Wiley-VCH; 2011. To appear.
[25] Orth JD, Thiele I, Palsson B. What is flux balance analysis? Nat Biotechnol. 2010 Mar;28(3):245–248.
[26] Raman K, Chandra N. Flux balance analysis of biological systems: applications and challenges. Brief Bioinform. 2009 Jul;10(4):435–449.
[27] Mrabet Y, Semmar N. Mathematical methods to analysis of topology, functional variability and evolution of metabolic systems based on different decomposition concepts. Curr Drug Metab. 2010 May;11(4):315–341.
[28] Price ND, Reed JL, Palsson B. Genome-scale models of microbial cells: evaluating the consequences of constraints. Nat Rev Microbiol. 2004 Nov;2(11):886–897.
[29] Pinchuk GE, Hill EA, Geydebrekht OV, De Ingeniis J, Zhang X, Osterman A, et al. Constraint-based model of Shewanella oneidensis MR-1 metabolism: a tool for data analysis and hypothesis generation. PLoS Comput Biol. 2010 Jun;6(6):e1000822.
[30] Westerhoff HV, Palsson BO. The evolution of molecular biology into systems biology. Nat Biotechnol. 2004 Oct;22(10):1249–1252.
[31] Schellenberger J, Park JO, Conrad TM, Palsson B. BiGG: a Biochemical Genetic and Genomic knowledgebase of large scale metabolic reconstructions. BMC Bioinformatics. 2010;11:213.
[32] Oh YK, Joyce AR, Palsson BO. Constraint-based Genome-Scale In Silico Models for Systems Biology. Asia Pacific Biotech News. 2006;10(3):123–136.
[33] Reed JL, Vo TD, Schilling CH, Palsson BO. An expanded genome-scale model of Escherichia coli K-12 (iJR904 GSM/GPR). Genome Biol. 2003;4(9):R54.
[34] Feist AM, Henry CS, Reed JL, Krummenacker M, Joyce AR, Karp PD, et al. A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Mol Syst Biol. 2007;3:121.
[35] Thiele I, Palsson B. A protocol for generating a high-quality genome-scale metabolic reconstruction. Nat Protoc. 2010;5(1):93–121.
[36] Famili I, Forster J, Nielsen J, Palsson BO. Saccharomyces cerevisiae phenotypes can be predicted by using constraint-based analysis of a genome-scale reconstructed metabolic network. Proc Natl Acad Sci USA. 2003 Nov;100(23):13134–13139.
[37] Jerby L, Shlomi T, Ruppin E. Computational reconstruction of tissue-specific metabolic models: application to human liver metabolism. Mol Syst Biol. 2010 Sep;6:401.
[38] Rolfsson O, Palsson B, Thiele I. The human metabolic reconstruction Recon 1 directs hypotheses of novel human metabolic functions. BMC Syst Biol. 2011;5:155.
[39] Shlomi T, Cabili MN, Herrgård MJ, Palsson B, Ruppin E. Network-based prediction of human tissue-specific metabolism. Nat Biotechnol. 2008 Sep;26(9):1003–1010.
[40] Shlomi T, Benyamini T, Gottlieb E, Sharan R, Ruppin E. Genome-scale metabolic modeling elucidates the role of proliferative adaptation in causing the Warburg effect. PLoS Comput Biol. 2011 Mar;7(3):e1002018.
[41] Hao T, Ma HW, Zhao XM, Goryanin I. Compartmentalization of the Edinburgh Human Metabolic Network. BMC Bioinformatics. 2010;11:393.
[42] Bordbar A, Palsson BO. Using the reconstructed genome-scale human metabolic network to study physiology and pathology. J Intern Med. 2012 Feb;271(2):131–141.
[43] Bordbar A, Lewis NE, Schellenberger J, Palsson B, Jamshidi N. Insight into human alveolar macrophage and M. tuberculosis interactions via metabolic reconstructions. Mol Syst Biol. 2010 Oct;6:422.
[44] Gille C, Bölling C, Hoppe A, Bulik S, Hoffmann S, Hübner K, et al.
HepatoNet1: a comprehensive metabolic reconstruction of the human hepatocyte for the analysis of liver physiology. Mol Syst Biol. 2010 Sep;6:411.
[45] Chang RL, Xie L, Xie L, Bourne PE, Palsson B. Drug off-target effects predicted using structural analysis in the context of a metabolic network model. PLoS Comput Biol. 2010;6(9):e1000938.
[46] Lewis NE, Schramm G, Bordbar A, Schellenberger J, Andersen MP, Cheng JK, et al. Large-scale in silico modeling of metabolic interactions between cell types in the human brain. Nat Biotechnol. 2010 Dec;28(12):1279–1285.
[47] Bordbar A, Jamshidi N, Palsson BO. iAB-RBC-283: A proteomically derived knowledge-base of erythrocyte metabolism that can be used to simulate its physiological and patho-physiological states. BMC Syst Biol. 2011;5:110.
[48] Bordbar A, Feist AM, Usaite-Black R, Woodcock J, Palsson BO, Famili I. A multi-tissue type genome-scale metabolic network for analysis of whole-body systems physiology. BMC Syst Biol. 2011;5:180.
[49] Hucka M, Bergmann FT, Hoops S, Keating SM, Sahle S, Smith LP, et al. The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Nature Proceedings; 2010.
[50] Gianchandani EP, Chavali AK, Papin JA. The application of flux balance analysis in systems biology. Wiley Interdiscip Rev Syst Biol Med. 2010;2(3):372–382.
[51] Becker SA, Palsson BO. Context-specific metabolic networks are consistent with experiments. PLoS Comput Biol. 2008 May;4(5):e1000082.
[52] Patil KR, Nielsen J. Uncovering transcriptional regulation of metabolism by using metabolic network topology. Proc Natl Acad Sci USA. 2005 Feb;102(8):2685–2689.
[53] Herold H, Lurz B, Wohlrab J. Grundlagen der Informatik. Pearson Studium; 2006.
[54] Jensen PA, Lutz KA, Papin JA. TIGER: Toolbox for integrating genome-scale metabolic models, expression data, and transcriptional regulatory networks. BMC Syst Biol. 2011;5:147.
[55] Klamt S, Saez-Rodriguez J, Gilles ED.
Structural and functional analysis of cellular networks with CellNetAnalyzer. BMC Syst Biol. 2007;1:2.
[56] Cvijovic M, Olivares-Hernández R, Agren R, Dahr N, Vongsangnak W, Nookaew I, et al. BioMet Toolbox: genome-wide analysis of metabolism. Nucleic Acids Res. 2010 Jul;38(Web Server issue):W144–W149.
[57] Jensen PA, Papin JA. Functional integration of a metabolic network model and expression data without arbitrary thresholding. Bioinformatics. 2011 Feb;27(4):541–547.
[58] Rocha I, Maia P, Evangelista P, Vilaça P, Soares S, Pinto JP, et al. OptFlux: an open-source software platform for in silico metabolic engineering. BMC Syst Biol. 2010;4:45.
[59] Kamp A, Schuster S. Metatool 5.0: fast and flexible elementary modes analysis. Bioinformatics. 2006 Aug;22(15):1930–1931.
[60] Segrè D, Vitkup D, Church GM. Analysis of optimality in natural and perturbed metabolic networks. Proc Natl Acad Sci USA. 2002 Nov;99(23):15112–15117.
[61] Shlomi T, Berkman O, Ruppin E. Regulatory on/off minimization of metabolic flux changes after genetic perturbations. Proc Natl Acad Sci USA. 2005 May;102(21):7695–7700.
[62] Burgard AP, Pharkya P, Maranas CD. Optknock: a bilevel programming framework for identifying gene knockout strategies for microbial strain optimization. Biotechnol Bioeng. 2003 Dec;84(6):647–657.
[63] Patil KR, Rocha I, Förster J, Nielsen J. Evolutionary programming as a platform for in silico metabolic engineering. BMC Bioinformatics. 2005;6:308.
[64] Terzer M, Stelling J. Large-scale computation of elementary flux modes with bit pattern trees. Bioinformatics. 2008 Oct;24(19):2229–2235.
[65] Funahashi A, Morohashi M, Kitano H. CellDesigner: a process diagram editor for gene-regulatory and biochemical networks. BIOSILICO. 2003;1:159–162.
[66] Oliveira AP, Patil KR, Nielsen J. Architecture of transcriptional regulatory circuits is knitted over the topology of bio-molecular interaction networks. BMC Syst Biol. 2008;2:17.
[67] Gentleman R, Carey V, Huber W, Irizarry R, Dudoit S, editors. Bioinformatics and Computational Biology Solutions Using R and Bioconductor (Statistics for Biology and Health). New York: Springer Science+Business Media; 2005.
[68] Warren P, Taylor D, Martini PGV, Jackson J, Bienkowska J. panp: Presence-Absence Calls from Negative Strand Matching Probesets. Under review.
[69] Irizarry RA, Wu Z, Jaffee HA. Comparison of Affymetrix GeneChip expression measures. Bioinformatics. 2006 Apr;22(7):789–794.
[70] Hahne F, Huber W, Gentleman R, Falcon S. Bioconductor Case Studies. 1st ed. Springer Publishing Company, Incorporated; 2008.
[71] Wu Z, Irizarry RA, Gentleman R, Murillo FM, Spencer F. A Model Based Background Adjustment for Oligonucleotide Expression Arrays. Johns Hopkins University, Dept of Biostatistics Working Papers. Working Paper 1; 2004.
[72] Wu Z, Irizarry RA. Stochastic models inspired by hybridization theory for short oligonucleotide arrays. J Comput Biol. 2005;12(6):882–893.
[73] Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, et al. Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics. 2003 Apr;4(2):249–264.
[74] Naef F, Magnasco MO. Solving the riddle of the bright mismatches: Labeling and effective binding in oligonucleotide arrays. Phys Rev E. 2003 Jul;68:011906.
[75] Davis S, Meltzer PS. GEOquery: a bridge between the Gene Expression Omnibus (GEO) and BioConductor. Bioinformatics. 2007 Jul;23(14):1846–1847.
[76] Smyth GK, Ritchie M, Thorne N, Wettenhall J, Shi W. limma: Linear Models for Microarray Data User Guide; 2010.
[77] Brazma A, Parkinson H, Sarkans U, Shojatalab M, Vilo J, Abeygunawardena N, et al. ArrayExpress–a public repository for microarray gene expression data at the EBI. Nucleic Acids Res. 2003 Jan;31(1):68–71.
[78] Parkinson H, Sarkans U, Shojatalab M, Abeygunawardena N, Contrino S, Coulson R, et al.
ArrayExpress–a public repository for microarray gene expression data at the EBI. Nucleic Acids Res. 2005 Jan;33(Database issue):D553–D555.
[79] Degtyarenko K, de Matos P, Ennis M, Hastings J, Zbinden M, McNaught A, et al. ChEBI: a database and ontology for chemical entities of biological interest. Nucleic Acids Res. 2008 Jan;36(Database issue):D344–D350.
[80] Matos P, Alcántara R, Dekker A, Ennis M, Hastings J, Haug K, et al. Chemical Entities of Biological Interest: an update. Nucleic Acids Res. 2010 Jan;38(Database issue):D249–D254.
[81] Edgar R, Domrachev M, Lash AE. Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. 2002 Jan;30(1):207–210.
[82] Barrett T, Troup DB, Wilhite SE, Ledoux P, Rudnev D, Evangelista C, et al. NCBI GEO: archive for high-throughput functional genomic data. Nucleic Acids Res. 2009 Jan;37(Database issue):D885–D890.
[83] Barrett T, Troup DB, Wilhite SE, Ledoux P, Evangelista C, Kim IF, et al. NCBI GEO: archive for functional genomics data sets–10 years on. Nucleic Acids Res. 2011 Jan;39(Database issue):D1005–D1010.
[84] Wishart DS, Tzur D, Knox C, Eisner R, Guo AC, Young N, et al. HMDB: the Human Metabolome Database. Nucleic Acids Res. 2007 Jan;35(Database issue):D521–D526.
[85] Wishart DS, Knox C, Guo AC, Eisner R, Young N, Gautam B, et al. HMDB: a knowledgebase for the human metabolome. Nucleic Acids Res. 2009 Jan;37(Database issue):D603–D610.
[86] Kanehisa M, Goto S, Kawashima S, Nakaya A. The KEGG databases at GenomeNet. Nucleic Acids Res. 2002 Jan;30(1):42–46.
[87] Kanehisa M, Goto S, Sato Y, Furumichi M, Tanabe M. KEGG for integration and interpretation of large-scale molecular data sets. Nucleic Acids Res. 2012 Jan;40(Database issue):D109–D114.
[88] Kanehisa M, Goto S, Hattori M, Aoki-Kinoshita KF, Itoh M, Kawashima S, et al. From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res. 2006 Jan;34(Database issue):D354–D357.
[89] Kanehisa M.
Representation and analysis of molecular networks involving diseases and drugs. Genome Inform. 2009 Oct;23(1):212–213.
[90] Pihlajamäki J, Boes T, Kim EY, Dearie F, Kim BW, Schroeder J, et al. Thyroid hormone-related regulation of gene expression in human fatty liver. J Clin Endocrinol Metab. 2009 Sep;94(9):3521–3529.
[91] Leskinen T, Rinnankoski-Tuikka R, Rintala M, Seppänen-Laakso T, Pöllänen E, Alen M, et al. Differences in muscle and adipose tissue gene expression and cardio-metabolic risk factors in the members of physical activity discordant twin pairs. PLoS One. 2010;5(9).
[92] Lê KA, Mahurkar S, Alderete TL, Hasson RE, Adam TC, Kim JS, et al. Subcutaneous adipose tissue macrophage infiltration is associated with hepatic and visceral fat deposition, hyperinsulinemia, and stimulation of NF-kappaB stress pathway. Diabetes. 2011 Nov;60(11):2802–2809.
[93] Mutch DM, Pers TH, Temanni MR, Pelloux V, Marquez-Quiñones A, Holst C, et al. A distinct adipose tissue gene expression response to caloric restriction predicts 6-mo weight maintenance in obese subjects. Am J Clin Nutr. 2011 Dec;94(6):1399–1409.
[94] Mazzatti D, Lim FL, O’Hara A, Wood IS, Trayhurn P. A microarray analysis of the hypoxia-induced modulation of gene expression in human adipocytes. Arch Physiol Biochem. 2012 Jul;118(3):112–120.
[95] Johansson LE, Danielsson AP, Parikh H, Klintenberg M, Norström F, Groop L, et al. Differential gene expression in adipose tissue from obese human subjects during weight loss and weight maintenance. Am J Clin Nutr. 2012 Jul;96(1):196–207.
[96] Cline MS, Smoot M, Cerami E, Kuchinsky A, Landys N, Workman C, et al. Integration of biological networks and gene expression data using Cytoscape. Nat Protoc. 2007;2(10):2366–2382.
[97] Mardinoglu A, Nielsen J. Systems medicine and metabolic modelling. J Intern Med. 2012 Feb;271(2):142–154.
Appendix A
Results of all selected datasets

This appendix contains the results of the thesis (as presented in chapters 4.3 and 4.4) for each of the eight selected datasets, which are described in chapter 4.3.1. It is structured by dataset, and each subchapter includes the following results. The first part shows the results of the differential expression analysis using the limma package in R: a table lists the top 10 differentially expressed genes, the pathways in which they are involved, and those top 10 reporter metabolites that are part of these pathways. The second and third parts compare the results of the reporter metabolites analysis (i) using different models and the same expression data and (ii) using two kinds of expression data of one dataset and one model. For this purpose, the reporter features algorithm was applied to each set of gene expression data in combination with each of the three genome-scale metabolic models: adipocyte, EHMN, and Recon 1.

A.1 Gene expression in adipose tissue during weight loss (GSE11975)

Regulation of adipose tissue gene expression during different phases of a dietary weight loss program and its relationship with insulin sensitivity [14]:

- Energy restriction phase (ER) with a 4-week very-low-calorie diet
- Weight stabilization period (WS) composed of a 2-month low-calorie diet
- 3 to 4 months of a weight maintenance (DI) diet
- Two samples per dietary phase, one before and one after the specific phase

The following comparisons were applied for the calculation of the differentially expressed genes: (i) before vs. after energy restriction (ER), (ii) after energy restriction vs. after weight stabilization (WS), and (iii) before dietary intervention vs. after weight stabilization (DI).
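The reporter metabolites analysis used throughout this appendix follows the scoring idea of Patil and Nielsen [52]: the p-value of each gene is converted into a Z-score, the Z-scores of the genes neighbouring a metabolite are aggregated, and the aggregate is corrected against a background of randomly drawn gene sets of the same size. The following Python sketch illustrates this idea only; it is not the implementation used in the thesis. The gene names, p-values, metabolite names, and the function names `gene_z_scores` and `reporter_score` are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def gene_z_scores(p_values):
    """Convert gene p-values to Z-scores, z = Phi^-1(1 - p).

    Written as -Phi^-1(p), which is equivalent and numerically
    safer for very small p-values.
    """
    normal = NormalDist()
    return {gene: -normal.inv_cdf(p) for gene, p in p_values.items()}


def reporter_score(neighbours, z, n_background=1000, seed=0):
    """Aggregate and background-correct the Z-scores of the genes
    that neighbour one metabolite in the metabolic network."""
    k = len(neighbours)
    z_metabolite = sum(z[g] for g in neighbours) / sqrt(k)
    # Background: aggregated scores of random gene sets of size k.
    rng = random.Random(seed)
    pool = list(z.values())
    background = [
        sum(rng.sample(pool, k)) / sqrt(k) for _ in range(n_background)
    ]
    return (z_metabolite - mean(background)) / stdev(background)


# Hypothetical p-values from a differential expression analysis.
p_values = {
    "g1": 1e-6, "g2": 1e-4, "g3": 0.4, "g4": 0.7,
    "g5": 0.9, "g6": 0.05, "g7": 0.3, "g8": 0.6,
}
# Hypothetical metabolite-to-gene neighbourhoods.
neighbours = {"met_A": ["g1", "g2"], "met_B": ["g4", "g5"]}

z = gene_z_scores(p_values)
scores = {m: reporter_score(genes, z) for m, genes in neighbours.items()}
for metabolite, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(metabolite, round(score, 3))
```

In this sketch, a metabolite whose neighbouring genes are strongly differentially expressed receives a high corrected score and therefore a high rank, which corresponds to the ranks reported in the tables of this appendix.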
A.1.1 Differentially expressed genes

The following tables show the top 10 differentially expressed genes, the pathways they are involved in, and those top 10 reporter metabolites that are also involved in these pathways.

Before vs. after energy restriction (ER)

| Rank | EntrezID | RefSeqID | GeneName | P-value | Pathway | Adipocyte | EHMN | Recon1 |
|---|---|---|---|---|---|---|---|---|
| 1 | 2331 | NM_002023 | fibromodulin | 2.89100446064137e-10 | | | | |
| 2 | 230 | NM_005165 | aldolase C, fructose-bisphosphate | 1.27029911564754e-08 | hsa00010: Glycolysis/Gluconeogenesis | C00118, C00236 | | |
| | | | | | hsa00030: Pentose phosphate pathway | C00118 | C00577 | |
| | | | | | hsa00051: Fructose and mannose metabolism | C00118 | C00577 | |
| | | | | | hsa01100: Metabolic pathways | C00026, C00118, C00122, C00149, C00236, C05272 | C00010, C00122, C00149, C00311, C00577, C01944, C05271, C05276 | C00003, C00004, C00122, C00149, C00266, C00422 |
| 3 | 85329 | NM_033101 | lectin, galactoside-binding, soluble, 12 | 1.56113172754231e-08 | | | | |
| 4 | 26292 | NM_012333 | c-myc binding protein | 7.31809251898118e-08 | | | | |
| 5 | 4311 | NM_007289 | membrane metallo-endopeptidase | 7.68345348715221e-08 | hsa04614: Renin-angiotensin system | | | |
| | | | | | hsa04640: Hematopoietic cell lineage | | | |
| | | | | | hsa04974: Protein digestion and absorption | | | |
| | | | | | hsa05010: Alzheimer’s disease | | | |
| 6 | 5918 | NM_002888 | retinoic acid receptor responder (tazarotene induced) 1 | 8.69645640007105e-08 | | | | |
| 7 | 4015 | NM_002317 | lysyl oxidase | 1.01198150445751e-07 | | | | |
| 8 | 8991 | NM_003944 | selenium binding protein 1 | 1.20641993480657e-07 | | | | |
| 9 | 200942 | NM_173546 | kelch domain containing 8B | 1.59842794800541e-07 | | | | |
| 10 | 6678 | NM_003118 | secreted protein, acidic, cysteine-rich (osteonectin) | 1.72515267323231e-07 | | | | |

Table A.1: The top 10 differentially expressed genes from the comparison ‘before vs. after energy restriction’ with the corresponding pathways and reporter metabolites.

After energy restriction vs.
after weight stabilization (WS)

| Rank | EntrezID | RefSeqID | GeneName | P-value | Pathway | Adipocyte | EHMN | Recon1 |
|---|---|---|---|---|---|---|---|---|
| 1 | 5918 | NM_002888 | retinoic acid receptor responder (tazarotene induced) 1 | 1.96312275773292e-12 | | | | |
| 2 | 2512 | NM_000146 | ferritin, light polypeptide | 2.58942973403245e-11 | hsa00860: Porphyrin and chlorophyll metabolism | | | |
| | | | | | hsa04978: Mineral absorption | | C00124 | |
| 3 | 2495 | NM_002032 | ferritin, heavy polypeptide 1 | 5.77157843379722e-10 | hsa00860: Porphyrin and chlorophyll metabolism | | | |
| | | | | | hsa04978: Mineral absorption | | C00124 | |
| 4 | 2331 | NM_002023 | fibromodulin | 8.83712364280302e-10 | | | | |
| 5 | 7018 | NM_001063 | transferrin | 1.22261082954918e-09 | hsa04978: Mineral absorption | | C00124 | |
| 6 | 53940 | NM_031894 | ferritin, heavy polypeptide-like 17 | 2.89825261692573e-09 | | | | |
| 7 | 85329 | NM_03310 | lectin, galactoside-binding, soluble, 12 | 3.40382326546594e-09 | | | | |
| 8 | 6720 | NM_001005291 | sterol regulatory element binding transcription factor 1 | 5.69684281972593e-09 | hsa04910: Insulin signaling pathway | | | |
| 9 | 8991 | NM_003944 | selenium binding protein 1 | 5.72487175899984e-09 | | | | |
| 10 | 3693 | NM_002213 | integrin, beta 5 | 5.86078392028941e-09 | hsa04145: Phagosome | | | |
| | | | | | hsa04510: Focal adhesion | | | |
| | | | | | hsa04512: ECM-receptor interaction | | | |
| | | | | | hsa04810: Regulation of actin cytoskeleton | | | |
| | | | | | hsa05410: Hypertrophic cardiomyopathy (HCM) | | | |
| | | | | | hsa05412: Arrhythmogenic right ventricular cardiomyopathy (ARVC) | | | |
| | | | | | hsa05414: Dilated cardiomyopathy | | | |

Table A.2: The top 10 differentially expressed genes from the comparison ‘after energy restriction vs. after weight stabilization’ with the corresponding pathways and reporter metabolites.

Before dietary intervention vs.
after weight stabilization (DI)

| Rank | EntrezID | RefSeqID | GeneName | P-value | Pathway | Adipocyte | EHMN | Recon1 |
|---|---|---|---|---|---|---|---|---|
| 1 | 54968 | NM_001040613 | transmembrane protein 70 | 2.51811309478267e-08 | | | | |
| 2 | 950 | NM_005506 | scavenger receptor class B, member 2 | 6.8712935066448e-08 | hsa04142: Lysosome | | | |
| 3 | 2512 | NM_000146 | ferritin, light polypeptide | 7.50862402819666e-08 | hsa00860: Porphyrin and chlorophyll metabolism | | | C00430 |
| | | | | | hsa04978: Mineral absorption | C00080 | | |
| 4 | 2495 | NM_002032 | ferritin, heavy polypeptide 1 | 9.93681572742043e-08 | hsa00860: Porphyrin and chlorophyll metabolism | | | C00430 |
| | | | | | hsa04978: Mineral absorption | C00080 | | |
| 5 | 4794 | NM_004556 | nuclear factor of kappa light polypeptide gene enhancer in B-cells inhibitor, epsilon | 1.36610708335739e-07 | hsa04660: T cell receptor signaling pathway | | | |
| | | | | | hsa04662: B cell receptor signaling pathway | | | |
| | | | | | hsa04722: Neurotrophin signaling pathway | | | |
| | | | | | hsa04920: Adipocytokine signaling pathway | | C00083, C00162 | |
| | | | | | hsa05169: Epstein-Barr virus infection | | | |
| 6 | 5476 | NM_000308 | cathepsin A | 3.40254011668576e-07 | hsa04142: Lysosome | | | |
| | | | | | hsa04614: Renin-angiotensin system | | | |
| 7 | 11346 | NM_007286 | synaptopodin | 5.36612405808613e-07 | | | | |
| 8 | 10742 | NM_021785 | retinoic acid induced 2 | 6.47596378555396e-07 | | | | |
| 9 | 3200 | NM_153631 | homeobox A3 | 6.813484e-07 | | | | |
| 10 | 23363 | NM_001173431 | obscurin-like 1 | 7.315764e-07 | | | | |

Table A.3: The top 10 differentially expressed genes from the comparison ‘before dietary intervention vs. after weight stabilization’ with the corresponding pathways and reporter metabolites.

A.1.2 Comparison between the models

The following tables show the top 10 reporter metabolites of one model together with the ranks of these metabolites in the other two models, using the same expression data.

Before vs.
after energy restriction (ER)

| KEGG ID | Metabolite name | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| | eicosadienoyl-CoA (C20:2CoA, n-6) | 1 | 0.000815611 | NA | NA | NA | NA |
| | octadecadienoyl-CoA (C18:2CoA, n-6) | 2 | 0.00178503 | NA | NA | NA | NA |
| C00149 | L-Malate | 3 | 0.00269335 | 6 | 0.0000793772 | 53 | 0.0240723 |
| C00026 | 2-Oxoglutarate | 4 | 0.00282039 | 1069 | 0.520135 | 443 | 0.382618 |
| C00236 | 3-Phospho-D-glyceroyl phosphate | 5 | 0.0034244 | 1736 | 0.810936 | 193 | 0.139036 |
| | 1-Acyl-sn-glycerol 3-phosphate, adipocyte | 6 | 0.00742544 | NA | NA | NA | NA |
| | octadecatrienoyl-CoA (C18:3CoA, n-6) | 7 | 0.00803552 | NA | NA | NA | NA |
| | stearidonyl coenzyme A (C18:4CoA, n-3) | 7 | 0.00803552 | NA | NA | NA | NA |
| C00118 | Glyceraldehyde 3-phosphate | 8 | 0.00938127 | 1084 | 0.529239 | 16 | 0.00349265 |
| | docosenoyl-CoA (C22:1CoA, n-9) | 9 | 0.00959812 | NA | NA | NA | NA |
| C05272 | hexadecenoyl-CoA (C16:1CoA, n-9) | 9 | 0.00959812 | 515 | 0.243862 | 1010 | 0.820441 |
| C00510 | octadecenoyl-CoA (C18:1CoA, n-7) | 9 | 0.00959812 | 56 | 0.00651582 | 6 | 0.000563763 |
| C00122 | Fumarate | 10 | 0.00963716 | 592 | 0.291419 | 2 | 0.000196622 |

Table A.4: The top 10 reporter metabolites of the comparison ‘before vs. after energy restriction’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
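A comparison table such as Table A.4 can be assembled by matching the metabolites of the three models through their KEGG IDs; metabolites without a KEGG ID, or without a counterpart in another model, are reported as NA. The following Python sketch shows one possible way to do this. The data layout and the function name `compare_rankings` are hypothetical, and only three rows of Table A.4 are used as example input.

```python
from typing import Dict, List, Optional, Tuple

Entry = Tuple[int, float]  # (rank, p-value) of a metabolite in one model
Reference = Tuple[Optional[str], str, int, float]  # (KEGG ID, name, rank, p)


def compare_rankings(
    reference: List[Reference],
    others: Dict[str, Dict[str, Entry]],
    top_n: int = 10,
) -> List[dict]:
    """Look up the top-ranked metabolites of one model in the other models.

    Matching is done by KEGG ID; a missing ID or a missing match
    yields None, which is printed as NA below.
    """
    rows = []
    for kegg_id, name, rank, p_value in reference:
        if rank > top_n:
            continue
        row = {"kegg_id": kegg_id, "name": name, "reference": (rank, p_value)}
        for model, ranking in others.items():
            row[model] = ranking.get(kegg_id) if kegg_id is not None else None
        rows.append(row)
    return rows


# Three rows taken from Table A.4 (adipocyte model, 'before vs. after ER').
adipocyte_er: List[Reference] = [
    (None, "eicosadienoyl-CoA (C20:2CoA, n-6)", 1, 0.000815611),
    ("C00149", "L-Malate", 3, 0.00269335),
    ("C00026", "2-Oxoglutarate", 4, 0.00282039),
]
other_models = {
    "EHMN": {"C00149": (6, 0.0000793772), "C00026": (1069, 0.520135)},
    "Recon 1": {"C00149": (53, 0.0240723), "C00026": (443, 0.382618)},
}

for row in compare_rankings(adipocyte_er, other_models):
    cells = [row["kegg_id"] or "", row["name"], *row["reference"]]
    for model in other_models:
        cells.extend(row[model] or ("NA", "NA"))
    print(*cells, sep=" | ")
```

Matching by identifier rather than by name avoids mismatches caused by synonyms such as L-Malate and (S)-Malate, but it also means that every metabolite without a KEGG ID appears as NA in the other models.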
| KEGG ID | Metabolite name | EHMN rank | EHMN P-value | Adipocyte rank | Adipocyte P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| C01944 | Octanoyl-CoA | 1 | 0.0000464231 | 48 | 0.111422 | 52 | 0.0240429 |
| C02249 | Arachidonyl-CoA | 2 | 0.00005543 | NA | NA | 726 | 0.640881 |
| C00577 | D-Glyceraldehyde | 3 | 0.0000747738 | NA | NA | 99 | 0.0592681 |
| C05271 | trans-Hex-2-enoyl-CoA | 4 | 0.000075209 | NA | NA | NA | NA |
| C05276 | trans-Oct-2-enoyl-CoA | 4 | 0.000075209 | NA | NA | NA | NA |
| C00010 | CoA | 5 | 0.000079096 | 12 | 0.0142852 | 431 | 0.362262 |
| C00149 | (S)-Malate | 6 | 0.0000793772 | 32 | 0.0647696 | 53 | 0.0240723 |
| C00122 | Fumarate | 7 | 0.000255348 | 10 | 0.00963716 | 2 | 0.000196622 |
| CE5312 | 6(R)-hydroxy-tetradeca-2E,8Z-dienoate | 8 | 0.000293455 | NA | NA | NA | NA |
| CE5324 | 6(S)-hydroxy-tetradeca-2E,8Z-dienoate | 8 | 0.000293455 | NA | NA | NA | NA |
| CE5315 | 8(R)-hydroxy-hexadeca-2E,6E,10Z-trienoate | 8 | 0.000293455 | NA | NA | NA | NA |
| CE5327 | 8(S)-hydroxy-hexadeca-2E,6E,10Z-trienoate | 8 | 0.000293455 | NA | NA | NA | NA |
| CE0852 | palmitoleoyl-CoA | 9 | 0.000319094 | NA | NA | NA | NA |
| C00311 | Isocitrate | 10 | 0.00035769 | 19 | 0.0258153 | 39 | 0.017695 |

Table A.5: The top 10 reporter metabolites of the comparison ‘before vs. after energy restriction’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
| KEGG ID | Metabolite name | Recon 1 rank | Recon 1 P-value | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value |
|---|---|---|---|---|---|---|---|
| C00004 | Nicotinamide adenine dinucleotide - reduced | 1 | 0.000188955 | 126 | 0.420891 | 99 | 0.0164511 |
| C00122 | Fumarate | 2 | 0.000196622 | 10 | 0.00963716 | 592 | 0.291419 |
| C00003 | Nicotinamide adenine dinucleotide | 3 | 0.000201739 | 126 | 0.420891 | 140 | 0.0324125 |
| C00149 | L-Malate | 4 | 0.000416562 | 32 | 0.0647696 | 6 | 0.0000793772 |
| | tetracosahexaenoyl coenzyme A | 5 | 0.000425598 | NA | NA | NA | NA |
| C00510 | Octadecenoyl-CoA (n-C18:1CoA) | 6 | 0.000563763 | 63 | 0.194446 | 56 | 0.00651582 |
| C16218 | trans-Octadec-2-enoyl-CoA | 6 | 0.000563763 | NA | NA | NA | NA |
| | vaccenyl coenzyme A | 6 | 0.000563763 | NA | NA | NA | NA |
| C00422 | triacylglycerol (homo sapiens) | 7 | 0.000585207 | NA | NA | 1159 | 0.565755 |
| | R total Coenzyme A | 8 | 0.00117564 | NA | NA | NA | NA |
| C01181 | 4-Trimethylammoniobutanoate | 9 | 0.00138316 | NA | NA | 757 | 0.37199 |
| C00266 | Glycolaldehyde | 10 | 0.00142146 | NA | NA | 926 | 0.462381 |

Table A.6: The top 10 reporter metabolites of the comparison ‘before vs. after energy restriction’ using the Recon 1 model in comparison to the adipocyte and EHMN model.

After energy restriction vs.
after weight stabilization (WS)

| KEGG ID | Metabolite name | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| C00010 | Coenzyme A | 1 | 0.00043073 | 634 | 0.289467 | 418 | 0.355848 |
| | eicosadienoyl-CoA (C20:2CoA, n-6) | 2 | 0.000529598 | NA | NA | NA | NA |
| C00083 | Malonyl-CoA | 3 | 0.000983564 | 199 | 0.0494886 | 541 | 0.466681 |
| C00100 | Propanoyl-CoA (C3:0CoA) | 4 | 0.00127582 | 7 | 0.000292561 | 1145 | 0.924431 |
| | octadecadienoyl-CoA (C18:2CoA, n-6) | 5 | 0.00257569 | NA | NA | NA | NA |
| | average fatty-acyl CoA, human adipocyte | 6 | 0.00324056 | NA | NA | NA | NA |
| | 1-Acyl-sn-glycerol 3-phosphate, adipocyte | 7 | 0.00343025 | NA | NA | NA | NA |
| C00033 | Acetate | 8 | 0.00537747 | 616 | 0.27988 | 199 | 0.145767 |
| | docosenoyl-CoA (C22:1CoA, n-9) | 9 | 0.00762508 | NA | NA | NA | NA |
| C05272 | hexadecenoyl-CoA (C16:1CoA, n-9) | 9 | 0.00762508 | 1014 | 0.493451 | 709 | 0.591968 |
| C00510 | octadecenoyl-CoA (C18:1CoA, n-7) | 9 | 0.00762508 | 460 | 0.187829 | 4 | 0.000833699 |
| C00163 | Propionate | 10 | 0.00966763 | 2066 | 0.965631 | 13 | 0.00543786 |

Table A.7: The top 10 reporter metabolites of the comparison ‘after energy restriction vs. after weight stabilization’ using the adipocyte model in comparison to the EHMN and Recon 1 model.

| KEGG ID | Metabolite name | EHMN rank | EHMN P-value | Adipocyte rank | Adipocyte P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| C00010 | CoA | 1 | 0.0000442993 | 1 | 0.00043073 | 418 | 0.355848 |
| C00116 | Glycerol | 2 | 0.0000739748 | 219 | 0.774998 | 972 | 0.792624 |
| C00022 | Pyruvate | 3 | 0.0000846602 | 40 | 0.0811011 | 987 | 0.806428 |
| C00124 | D-Galactose | 4 | 0.000107323 | NA | NA | 36 | 0.0144961 |
| C00001 | H2O | 5 | 0.000130733 | 198 | 0.716951 | 1115 | 0.902857 |
| CE2432 | trans-2-cis,cis-5,8-tetradecatrienoyl-CoA | 6 | 0.000231587 | NA | NA | NA | NA |
| C00100 | Propanoyl-CoA | 7 | 0.000292561 | 78 | 0.240532 | 1145 | 0.924431 |
| C00630 | 2-Methylpropanoyl-CoA | 8 | 0.000310565 | 42 | 0.0843622 | 281 | 0.225949 |
| C00149 | (S)-Malate | 9 | 0.000548462 | 25 | 0.0340194 | 33 | 0.0127139 |
| C00256 | (R)-Lactate | 10 | 0.000578654 | NA | NA | 547 | 0.471218 |

Table A.8: The top 10 reporter metabolites of the comparison ‘after energy restriction vs. after weight stabilization’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
| KEGG ID | Metabolite name | Recon 1 rank | Recon 1 P-value | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value |
|---|---|---|---|---|---|---|---|
| C00422 | triacylglycerol (homo sapiens) | 1 | 0.000434381 | NA | NA | 446 | 0.179351 |
| C00665 | D-Fructose 2,6-bisphosphate | 2 | 0.000459701 | 60 | 0.138369 | 1657 | 0.774421 |
| C00412 | Stearoyl-CoA (n-C18:0CoA) | 3 | 0.000772334 | 106 | 0.370419 | 525 | 0.224664 |
| C00510 | Octadecenoyl-CoA (n-C18:1CoA) | 4 | 0.000833699 | 106 | 0.370419 | 460 | 0.187829 |
| C16218 | trans-Octadec-2-enoyl-CoA | 4 | 0.000833699 | NA | NA | NA | NA |
| | vaccenyl coenzyme A | 4 | 0.000833699 | NA | NA | NA | NA |
| | tetracosahexaenoyl coenzyme A | 5 | 0.000983564 | NA | NA | NA | NA |
| C00681 | lysophosphatidic acid (homo sapiens) | 6 | 0.00173502 | NA | NA | 158 | 0.0345668 |
| C00581 | Guanidinoacetate | 7 | 0.00185154 | NA | NA | 22 | 0.00237982 |
| C01149 | 4-Trimethylammoniobutanal | 8 | 0.00303461 | NA | NA | 584 | 0.261257 |
| C00671 | (S)-3-Methyl-2-oxopentanoate | 9 | 0.00357221 | 34 | 0.0714451 | 1038 | 0.501129 |
| C00141 | 3-Methyl-2-oxobutanoate | 9 | 0.00357221 | 34 | 0.0714451 | 30 | 0.00348937 |
| C00233 | 4-Methyl-2-oxopentanoate | 9 | 0.00357221 | 34 | 0.0714451 | 30 | 0.00348937 |
| C00122 | Fumarate | 10 | 0.00409946 | 16 | 0.0223838 | 345 | 0.126616 |

Table A.9: The top 10 reporter metabolites of the comparison ‘after energy restriction vs. after weight stabilization’ using the Recon 1 model in comparison to the adipocyte and EHMN model.

Before dietary intervention vs.
after weight stabilization (DI)

| KEGG ID | Metabolite name | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| C00236 | 3-Phospho-D-glyceroyl phosphate | 1 | 0.000241325 | 721 | 0.361765 | 149 | 0.0980095 |
| C00365 | dUMP | 2 | 0.000882425 | 156 | 0.0520995 | 14 | 0.0100781 |
| C00631 | D-Glycerate 2-phosphate | 3 | 0.00169653 | 2068 | 0.966784 | 254 | 0.185858 |
| C00033 | Acetate | 4 | 0.00896914 | 2009 | 0.943246 | 273 | 0.206262 |
| C00186 | L-Lactate | 4 | 0.00896914 | 1610 | 0.777097 | 1012 | 0.843079 |
| C00364 | dTMP | 5 | 0.00989853 | 1278 | 0.640701 | 24 | 0.0153494 |
| C01342 | Ammonium | 6 | 0.0155426 | 1423 | 0.693694 | 635 | 0.52276 |
| C11455 | 4,4-dimethylcholesta-8,14,24-trienol | 7 | 0.0167523 | 191 | 0.068952 | 32 | 0.0223548 |
| C00080 | H+ | 8 | 0.0304529 | 1900 | 0.906596 | 667 | 0.555862 |
| C00197 | 3-Phospho-D-glycerate | 9 | 0.0427859 | 326 | 0.146362 | 973 | 0.803022 |
| C00021 | S-Adenosyl-L-homocysteine | 10 | 0.0432681 | 671 | 0.340687 | 436 | 0.346229 |

Table A.10: The top 10 reporter metabolites of the comparison ‘before dietary intervention vs. after weight stabilization’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
| KEGG ID | Metabolite name | EHMN rank | EHMN P-value | Adipocyte rank | Adipocyte P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| CE1918 | 5-hydroxytryptophol | 1 | 0.000124565 | NA | NA | NA | NA |
| CE2252 | 3-oxooctadecanoyl-CoA | 2 | 0.000475843 | NA | NA | NA | NA |
| C00010 | CoA | 2 | 0.000475843 | 97 | 0.371356 | 255 | 0.185966 |
| C00083 | Malonyl-CoA | 2 | 0.000475843 | 126 | 0.444287 | 344 | 0.269801 |
| C00154 | Palmitoyl-CoA | 2 | 0.000475843 | 151 | 0.543014 | 407 | 0.323938 |
| C01003 | Myosin light chain | 3 | 0.000720087 | NA | NA | NA | NA |
| C03875 | Myosin light chain phosphate | 3 | 0.000720087 | NA | NA | NA | NA |
| C00249 | Hexadecanoic acid | 4 | 0.000742868 | 111 | 0.399068 | 288 | 0.216668 |
| C06412 | Palmitoyl-protein | 4 | 0.000742868 | NA | NA | NA | NA |
| C00001 | H2O | 5 | 0.00112052 | 244 | 0.860932 | 686 | 0.570013 |
| C05889 | Undecaprenyl-diphospho-N-acetylmuramoyl-(N-acetylglucosamine)-L | 6 | 0.00119732 | NA | NA | NA | NA |
| C05890 | Undecaprenyl-diphospho-N-acetylmuramoyl-(N-acetylglucosamine)-L | 6 | 0.00119732 | NA | NA | NA | NA |
| C05893 | Undecaprenyl-diphospho-N-acetylmuramoyl-(N-acetylglucosamine)-L | 6 | 0.00119732 | NA | NA | NA | NA |
| C05894 | Undecaprenyl-diphospho-N-acetylmuramoyl-(N-acetylglucosamine)-L | 6 | 0.00119732 | NA | NA | NA | NA |
| C05898 | Undecaprenyl-diphospho-N-acetylmuramoyl-(N-acetylglucosamine)-L | 6 | 0.00119732 | NA | NA | NA | NA |
| C05899 | Undecaprenyl-diphospho-N-acetylmuramoyl-(N-acetylglucosamine)-L | 6 | 0.00119732 | NA | NA | NA | NA |
| C01290 | beta-D-Galactosyl-1,4-beta-D-glucosylceramide | 7 | 0.00129634 | NA | NA | 785 | 0.661467 |
| C01582 | Galactose | 7 | 0.00129634 | NA | NA | NA | NA |
| C13856 | 2-Arachidonoylglycerol | 8 | 0.00141498 | NA | NA | NA | NA |
| C00162 | Fatty acid | 9 | 0.00202817 | NA | NA | NA | NA |
| CE3481 | 1-lyso-2-arachidonoyl-phosphatidate | 10 | 0.00226329 | NA | NA | NA | NA |
| C02960 | Ceramide 1-phosphate | 10 | 0.00226329 | NA | NA | 718 | 0.596105 |
| C00836 | Sphinganine | 10 | 0.00226329 | 175 | 0.6127 | 282 | 0.214305 |
| C00319 | Sphingosine | 10 | 0.00226329 | NA | NA | 756 | 0.637543 |

Table A.11: The top 10 reporter metabolites of the comparison ‘before dietary intervention vs. after weight stabilization’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
RESULTS OF ALL SELECTED DATASETS Recon1 model KEGG ID Metabolite name Adipocyte model Rank P-value Rank C00214 Thymidine 1 0.00121664 21 C00430 5-Amino-4-oxopentanoate 2 0.00207101 NA C00365 dUMP 3 0.00225587 2 C00740 D-Serine 4 0.00282043 NA C00671 (S)-3-Methyl-2-oxopentanoate 5 0.00311572 32 C00141 3-Methyl-2-oxobutanoate 5 0.00311572 C00233 4-Methyl-2-oxopentanoate 5 C00153 Nicotinamide C00108 P-value 0.076193 NA 0.000882425 NA EHMN model Rank 560 12 156 P-value 0.279837 0.00260582 0.0520995 1092 0.548522 0.106318 227 0.087367 32 0.106318 18 0.00382342 0.00311572 32 0.106318 18 0.00382342 6 0.00403591 NA NA 972 Anthranilate 7 0.00444249 NA NA 81 0.0224737 C05653 N-Formylanthranilate 7 0.00444249 NA NA 81 0.0224737 C00427 Prostaglandin H2 8 0.0049088 NA NA 775 C00147 Adenine 9 0.00650209 NA NA 36 0.00763469 C00262 Hypoxanthine 9 0.00650209 NA NA 36 0.00763469 C00294 Inosine 9 0.00650209 NA NA 36 0.00763469 C00672 2-Deoxy-D-ribose 1-phosphate 10 0.00752422 NA NA 64 0.0158303 0.483973 0.385616 Table A.12: The top 10 reporter metabolites of the comparison ’before dietary intervention vs. after weight stabilization’ using the Recon 1 model in comparison to the adipocyte and EHMN model. A.1.3 Comparison between expression data The following tables show the comparison of the top 10 reporter metabolites using the different expression data of this dataset and the same model. 
Adipocyte

| KEGG ID | Metabolite name | DI Rank | DI P-value | ER Rank | ER P-value | WS Rank | WS P-value |
|---|---|---|---|---|---|---|---|
| C00236 | 3-Phospho-D-glyceroyl phosphate | 1 | 0.000241325 | 5 | 0.0034244 | 252 | 0.868703 |
| C00365 | dUMP | 2 | 0.000882425 | 140 | 0.484077 | 138 | 0.497858 |
| C00631 | D-Glycerate 2-phosphate | 3 | 0.00169653 | 218 | 0.797504 | 65 | 0.172218 |
| C00033 | Acetate | 4 | 0.00896914 | 84 | 0.276577 | 8 | 0.00537747 |
| C00186 | L-Lactate | 4 | 0.00896914 | 172 | 0.641447 | 44 | 0.0857472 |
| C00364 | dTMP | 5 | 0.00989853 | 245 | 0.864606 | 147 | 0.525764 |
| C01342 | Ammonium | 6 | 0.0155426 | 284 | 0.96675 | 52 | 0.110458 |
| C11455 | 4,4-dimethylcholesta-8,14,24-trienol | 7 | 0.0167523 | 184 | 0.688295 | 22 | 0.0284681 |
| C00080 | H+ | 8 | 0.0304529 | 274 | 0.928399 | 282 | 0.938686 |
| C00197 | 3-Phospho-D-glycerate | 9 | 0.0427859 | 130 | 0.438251 | 59 | 0.137453 |
| C00021 | S-Adenosyl-L-homocysteine | 10 | 0.0432681 | 161 | 0.582947 | 136 | 0.495402 |

Table A.13: The comparison of the top 10 reporter metabolites between ‘before dietary intervention vs. after weight stabilization (DI)’, ‘before vs. after energy restriction (ER)’, and ‘after energy restriction vs. after weight stabilization (WS)’ based on the adipocyte model.
RESULTS OF ALL SELECTED DATASETS ER KEGG ID Metabolite name DI Rank P-value Rank WS P-value eicosadienoyl-CoA (C20:2CoA, n-6) 1 0.000815611 192 0.670054 Rank P-value 2 0.000529598 0.00257569 octadecadienoyl-CoA (C18:2CoA, n-6) 2 0.00178503 183 0.650181 5 C00149 L-Malate 3 0.00269335 282 0.975743 25 C00026 2-Oxoglutarate 4 0.00282039 121 0.427831 188 0.673654 C00236 3-Phospho-D-glyceroyl phosphate 5 0.0034244 0.000241325 252 0.868703 1-Acyl-sn-glycerol 3-phosphate, adipocyte 6 0.00742544 112 0.399639 7 octadecatrienoyl-CoA (C18:3CoA, n-6) 7 0.00803552 141 0.494175 23 0.0299012 stearidonyl coenzyme A (C18:4CoA, n-3) 7 0.00803552 141 0.494175 23 0.0299012 Glyceraldehyde 3-phosphate 8 0.00938127 12 docosenoyl-CoA (C22:1CoA, n-9) 9 0.00959812 170 0.605169 9 0.00762508 C05272 hexadecenoyl-CoA (C16:1CoA, n-9) 9 0.00959812 170 0.605169 9 0.00762508 C00510 octadecenoyl-CoA (C18:1CoA, n-7) 9 0.00959812 200 0.714021 106 C00122 Fumarate 10 0.00963716 261 0.908253 16 C00118 1 0.0507808 169 0.0340194 0.00343025 0.595346 0.370419 0.0223838 Table A.14: The comparison of the top 10 reporter metabolites between ’before vs. after energy restriction (ER)’, ’before dietary intervention vs. after weight stabilization (DI)’, and ’after energy restriction vs. after weight stabilization (WS)’ based on the adipocyte model. 
WS KEGG ID C00010 Metabolite name DI Rank P-value Coenzyme A 1 0.00043073 Rank ER P-value 97 0.371356 Rank 12 P-value 0.0142852 eicosadienoyl-CoA (C20:2CoA, n-6) 2 0.000529598 192 0.670054 1 C00083 Malonyl-CoA 3 0.000983564 126 0.444287 26 0.048913 C00100 Propanoyl-CoA (C3:0CoA) 4 0.00127582 235 0.82635 91 0.303512 octadecadienoyl-CoA (C18:2CoA, n-6) 5 0.00257569 183 0.650181 2 average fatty-acyl CoA, human adipocyte 6 0.00324056 125 0.443888 15 1-Acyl-sn-glycerol 3-phosphate, adipocyte 7 0.00343025 112 0.399639 6 Acetate 8 0.00537747 136 0.472094 84 docosenoyl-CoA (C22:1CoA, n-9) 9 0.00762508 170 0.605169 9 0.00959812 C05272 hexadecenoyl-CoA (C16:1CoA, n-9) 9 0.00762508 170 0.605169 9 0.00959812 C00510 octadecenoyl-CoA (C18:1CoA, n-7) 9 0.00762508 200 0.714021 63 0.194446 C00163 Propionate 10 0.00966763 230 0.802412 109 0.384904 C00033 0.000815611 0.00178503 0.0164075 0.00742544 0.276577 Table A.15: The comparison of the top 10 reporter metabolites between ’after energy restriction vs. after weight stabilization (WS)’, ’before dietary intervention vs. after weight stabilization (DI)’, and ’before vs. after energy restriction (ER)’ based on the adipocyte model. 72 APPENDIX A. 
RESULTS OF ALL SELECTED DATASETS EHMN DI KEGG ID Metabolite name ER Rank P-value Rank WS P-value 0.573729 Rank P-value CE1918 5-hydroxytryptophol 1 0.000124565 1174 88 0.0126586 CE2252 3-oxooctadecanoyl-CoA 2 0.000475843 251 C00010 CoA 2 0.000475843 121 0.0861474 257 0.0832488 0.0241291 634 C00083 Malonyl-CoA 2 0.000475843 144 0.289467 0.0333697 199 C00154 Palmitoyl-CoA 2 0.000475843 29 0.0494886 0.00234422 102 C01003 Myosin light chain 3 0.000720087 1507 0.717333 1930 0.015661 0.90093 C03875 Myosin light chain phosphate 3 0.000720087 1507 0.717333 1930 0.90093 C00249 Hexadecanoic acid 4 0.000742868 516 0.244235 183 C06412 Palmitoyl-protein 4 0.000742868 1280 0.619202 16 C00001 H2O 5 0.00112052 1991 0.928843 1215 C05889 Undecaprenyl-diphospho-N-acetylmuramoyl- 6 0.00119732 1129 0.552433 19 0.00180818 6 0.00119732 1129 0.552433 19 0.00180818 6 0.00119732 1129 0.552433 19 0.00180818 6 0.00119732 1129 0.552433 19 0.00180818 6 0.00119732 1129 0.552433 19 0.00180818 6 0.00119732 1129 0.552433 19 0.00180818 7 0.00129634 1460 0.697697 1240 0.0410749 0.00146671 0.580405 (N-acetylglucosamine)-L C05890 Undecaprenyl-diphospho-N-acetylmuramoyl(N-acetylglucosamine)-L C05893 Undecaprenyl-diphospho-N-acetylmuramoyl(N-acetylglucosamine)-L C05894 Undecaprenyl-diphospho-N-acetylmuramoyl(N-acetylglucosamine)-L C05898 Undecaprenyl-diphospho-N-acetylmuramoyl(N-acetylglucosamine)-L C05899 Undecaprenyl-diphospho-N-acetylmuramoyl(N-acetylglucosamine)-L C01290 beta-D-Galactosyl-1,4-beta-D- 0.591959 glucosylceramide C01582 Galactose 7 0.00129634 1473 0.702801 55 C13856 2-Arachidonoylglycerol 8 0.00141498 1644 0.779317 1482 0.00724312 0.696163 C00162 Fatty acid 9 0.00202817 1947 0.910556 827 0.400344 CE3481 1-lyso-2-arachidonoyl-phosphatidate 10 0.00226329 599 0.293282 246 0.0780639 C02960 Ceramide 1-phosphate 10 0.00226329 1644 0.779317 1482 0.696163 C00836 Sphinganine 10 0.00226329 2096 0.989158 1908 0.886791 C00319 Sphingosine 10 0.00226329 1741 0.813051 1793 0.830949 Table A.16: The 
comparison of the top 10 reporter metabolites between ’before dietary intervention vs. after weight stabilization (DI)’, ’before vs. after energy restriction (ER)’, and ’after energy restriction vs. after weight stabilization (WS)’ based on the EHMN model. ER KEGG ID Metabolite name DI Rank P-value Rank 1569 WS P-value P-value C01944 Octanoyl-CoA 1 0.0000464231 C02249 Arachidonyl-CoA 2 0.00005543 C00577 D-Glyceraldehyde 3 0.0000747738 C05271 trans-Hex-2-enoyl-CoA 4 0.000075209 1546 0.747177 69 0.0101871 C05276 trans-Oct-2-enoyl-CoA 4 0.000075209 1546 0.747177 69 0.0101871 C00010 CoA 5 0.000079096 962 0.478043 634 C00149 (S)-Malate 6 0.0000793772 2084 0.978406 9 C00122 Fumarate 7 0.000255348 1860 0.886873 345 CE5312 6(R)-hydroxy-tetradeca-2E,8Z-dienoate 8 0.000293455 1185 0.592985 38 0.00465786 CE5324 6(S)-hydroxy-tetradeca-2E,8Z-dienoate 8 0.000293455 1185 0.592985 38 0.00465786 CE5315 8(R)-hydroxy-hexadeca-2E,6E,10Z-trienoate 8 0.000293455 1185 0.592985 38 0.00465786 CE5327 8(S)-hydroxy-hexadeca-2E,6E,10Z-trienoate 8 0.000293455 1185 0.592985 38 0.00465786 CE0852 palmitoleoyl-CoA 9 0.000319094 91 C00311 Isocitrate 10 0.00035769 258 16 1591 0.75662 Rank 1144 0.548302 0.101004 789 0.379997 0.00316358 773 0.367531 0.289467 0.000548462 0.126616 0.0260006 387 0.141423 0.765921 144 0.0286262 Table A.17: The comparison of the top 10 reporter metabolites between ’before vs. after energy restriction (ER)’, ’before dietary intervention vs. after weight stabilization (DI)’, and ’after energy restriction vs. after weight stabilization (WS)’ based on the EHMN model. 73 APPENDIX A. 
RESULTS OF ALL SELECTED DATASETS WS KEGG ID Metabolite name DI ER Rank P-value Rank P-value 962 0.478043 0.0368215 Rank 121 P-value C00010 CoA 1 0.0000442993 0.0241291 C00116 Glycerol 2 0.0000739748 120 84 0.013672 C00022 Pyruvate 3 0.0000846602 1610 0.777097 1758 0.824276 C00124 D-Galactose 4 0.000107323 859 0.422313 2018 0.941676 C00001 H2O 5 0.000130733 663 0.338044 1991 0.928843 CE2432 trans-2-cis,cis-5,8-tetradecatrienoyl-CoA 6 0.000231587 1945 C00100 Propanoyl-CoA 7 0.000292561 C00630 2-Methylpropanoyl-CoA 8 C00149 (S)-Malate C00256 (R)-Lactate 0.92305 46 0.00477192 706 0.357157 17 0.000862806 0.000310565 893 0.443749 15 0.000663553 9 0.000548462 2084 0.978406 6 10 0.000578654 1610 0.777097 1758 0.0000793772 0.824276 Table A.18: The comparison of the top 10 reporter metabolites between ’after energy restriction vs. after weight stabilization (WS)’, ’before dietary intervention vs. after weight stabilization (DI)’, and ’before vs. after energy restriction (ER)’ based on the EHMN model. 
Recon 1

| KEGG ID | Metabolite name | DI Rank | DI P-value | ER Rank | ER P-value | WS Rank | WS P-value |
|---|---|---|---|---|---|---|---|
| C00214 | Thymidine | 1 | 0.00121664 | 1116 | 0.898702 | 99 | 0.0490703 |
| C00430 | 5-Amino-4-oxopentanoate | 2 | 0.00207101 | 68 | 0.0314408 | 124 | 0.0762054 |
| C00365 | dUMP | 3 | 0.00225587 | 361 | 0.300767 | 583 | 0.506843 |
| C00740 | D-Serine | 4 | 0.00282043 | 216 | 0.161373 | 158 | 0.105996 |
| C00671 | (S)-3-Methyl-2-oxopentanoate | 5 | 0.00311572 | 376 | 0.31599 | 345 | 0.287947 |
| C00141 | 3-Methyl-2-oxobutanoate | 5 | 0.00311572 | 376 | 0.31599 | 345 | 0.287947 |
| C00233 | 4-Methyl-2-oxopentanoate | 5 | 0.00311572 | 376 | 0.31599 | 345 | 0.287947 |
| C00153 | Nicotinamide | 6 | 0.00403591 | 379 | 0.319802 | 337 | 0.279185 |
| C00108 | Anthranilate | 7 | 0.00444249 | 935 | 0.782736 | 27 | 0.0105086 |
| C05653 | N-Formylanthranilate | 7 | 0.00444249 | 935 | 0.782736 | 27 | 0.0105086 |
| C00427 | Prostaglandin H2 | 8 | 0.0049088 | 113 | 0.0717896 | 130 | 0.0792916 |
| C00147 | Adenine | 9 | 0.00650209 | 320 | 0.262762 | 242 | 0.181477 |
| C00262 | Hypoxanthine | 9 | 0.00650209 | 419 | 0.350411 | 935 | 0.763497 |
| C00294 | Inosine | 9 | 0.00650209 | 524 | 0.456885 | 171 | 0.121001 |
| C00672 | 2-Deoxy-D-ribose 1-phosphate | 10 | 0.00752422 | 662 | 0.576983 | 685 | 0.578786 |

Table A.19: The comparison of the top 10 reporter metabolites between ‘before dietary intervention vs. after weight stabilization (DI)’, ‘before vs. after energy restriction (ER)’, and ‘after energy restriction vs. after weight stabilization (WS)’ based on the Recon 1 model.
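One way to read a table such as A.19 is to ask which of the DI top-ranked metabolites also score low P-values in the other two comparisons. The Python sketch below does this for a handful of rows copied from Table A.19. The cutoff `ALPHA = 0.05` and the helper name `consistent_hits` are illustrative assumptions and are not taken from the thesis.

```python
# (metabolite, DI P-value, ER P-value, WS P-value), copied from Table A.19.
rows = [
    ("Nicotinamide", 0.00403591, 0.319802, 0.279185),
    ("Anthranilate", 0.00444249, 0.782736, 0.0105086),
    ("N-Formylanthranilate", 0.00444249, 0.782736, 0.0105086),
    ("Prostaglandin H2", 0.0049088, 0.0717896, 0.0792916),
    ("Adenine", 0.00650209, 0.262762, 0.181477),
]

ALPHA = 0.05  # illustrative cutoff (assumption, not from the thesis)


def consistent_hits(rows, alpha=ALPHA):
    """Map each metabolite to the other comparisons in which its P-value is below alpha."""
    hits = {}
    for name, _di_p, er_p, ws_p in rows:
        also = [label for label, p in (("ER", er_p), ("WS", ws_p)) if p < alpha]
        if also:
            hits[name] = also
    return hits


shared = consistent_hits(rows)
for name, comparisons in shared.items():
    print(name, comparisons)
```

With these rows, only the two anthranilate metabolites pass the assumed cutoff, and only in the WS comparison.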
Recon 1

| KEGG ID | Metabolite name | ER Rank | ER P-value | DI Rank | DI P-value | WS Rank | WS P-value |
|---|---|---|---|---|---|---|---|
| C00004 | Nicotinamide adenine dinucleotide - reduced | 1 | 0.000188955 | 558 | 0.461106 | 54 | 0.0193415 |
| C00122 | Fumarate | 2 | 0.000196622 | 1017 | 0.845368 | 10 | 0.00409946 |
| C00003 | Nicotinamide adenine dinucleotide | 3 | 0.000201739 | 469 | 0.371783 | 85 | 0.0400579 |
| C00149 | L-Malate | 4 | 0.000416562 | 1181 | 0.966363 | 33 | 0.0127139 |
| | tetracosahexaenoyl coenzyme A | 5 | 0.000425598 | 610 | 0.502014 | 5 | 0.000983564 |
| C00510 | Octadecenoyl-CoA (n-C18:1CoA) | 6 | 0.000563763 | 401 | 0.318402 | 4 | 0.000833699 |
| C16218 | trans-Octadec-2-enoyl-CoA | 6 | 0.000563763 | 1027 | 0.854178 | 244 | 0.186712 |
| | vaccenyl coenzyme A | 6 | 0.000563763 | 1027 | 0.854178 | 244 | 0.186712 |
| C00422 | triacylglycerol (homo sapiens) | 7 | 0.000585207 | 1093 | 0.902535 | 972 | 0.792624 |
| | R total Coenzyme A | 8 | 0.00117564 | 428 | 0.339875 | 43 | 0.0169446 |
| C01181 | 4-Trimethylammoniobutanoate | 9 | 0.00138316 | 1078 | 0.895337 | 60 | 0.0235543 |
| C00266 | Glycolaldehyde | 10 | 0.00142146 | 230 | 0.166069 | 269 | 0.207325 |

Table A.20: The comparison of the top 10 reporter metabolites between ‘before vs. after energy restriction (ER)’, ‘before dietary intervention vs. after weight stabilization (DI)’, and ‘after energy restriction vs. after weight stabilization (WS)’ based on the Recon 1 model.

Recon 1

| KEGG ID | Metabolite name | WS Rank | WS P-value | DI Rank | DI P-value | ER Rank | ER P-value |
|---|---|---|---|---|---|---|---|
| C00422 | triacylglycerol (homo sapiens) | 1 | 0.000434381 | 1093 | 0.902535 | 714 | 0.629066 |
| C00665 | D-Fructose 2,6-bisphosphate | 2 | 0.000459701 | 160 | 0.106753 | 122 | 0.0742671 |
| C00412 | Stearoyl-CoA (n-C18:0CoA) | 3 | 0.000772334 | 378 | 0.297258 | 20 | 0.00530395 |
| C00510 | Octadecenoyl-CoA (n-C18:1CoA) | 4 | 0.000833699 | 401 | 0.318402 | 6 | 0.000563763 |
| C16218 | trans-Octadec-2-enoyl-CoA | 4 | 0.000833699 | 1027 | 0.854178 | 41 | 0.0179762 |
| | vaccenyl coenzyme A | 4 | 0.000833699 | 1027 | 0.854178 | 41 | 0.0179762 |
| | tetracosahexaenoyl coenzyme A | 5 | 0.000983564 | 610 | 0.502014 | 5 | 0.000425598 |
| C00681 | lysophosphatidic acid (homo sapiens) | 6 | 0.00173502 | 534 | 0.433646 | 13 | 0.00260034 |
| C00581 | Guanidinoacetate | 7 | 0.00185154 | 51 | 0.0371007 | 139 | 0.0892602 |
| C01149 | 4-Trimethylammoniobutanal | 8 | 0.00303461 | 983 | 0.81522 | 15 | 0.00343611 |
| C00671 | (S)-3-Methyl-2-oxopentanoate | 9 | 0.00357221 | 209 | 0.150242 | 376 | 0.31599 |
| C00141 | 3-Methyl-2-oxobutanoate | 9 | 0.00357221 | 209 | 0.150242 | 376 | 0.31599 |
| C00233 | 4-Methyl-2-oxopentanoate | 9 | 0.00357221 | 209 | 0.150242 | 376 | 0.31599 |
| C00122 | Fumarate | 10 | 0.00409946 | 1017 | 0.845368 | 2 | 0.000196622 |

Table A.21: The comparison of the top 10 reporter metabolites between ‘after energy restriction vs. after weight stabilization (WS)’, ‘before dietary intervention vs. after weight stabilization (DI)’, and ‘before vs. after energy restriction (ER)’ based on the Recon 1 model.
RESULTS OF ALL SELECTED DATASETS A.2 Expression data from human adipose tissue (GSE15773) Determination of gene expression signatures of omental and subcutaneous tissue samples [15]: - 5 insulin resistant probands - 5 insulin sensitive probands - Insulin-resistant probands and insulin-sensitive probands were paired by their Body-Mass-Index - One sample of subcutaneous and omental adipose tissue of each proband The following comparisons were applied for the calculation of the differentially expressed genes: (i) insulin resistant against insulin sensitive omental tissue and (ii) insulin resistant against insulin sensitive subcutaneous tissue. A.2.1 Differentially expressed genes The following tables show the top 10 differentially expressed genes, the pathways they are involved in, and those top 10 reporter metabolites, which are also involved in these pathways. Insulin resistant vs. insulin sensitive omental tissue Rank EntrezID GeneName P-value Pathway Adipocyte EHMN Recon1 RefSeqID 1 54715 NM 001142333 ataxin 2-binding 1.33239845066732e-05 protein 1 NM 001142334 NM 018723 NM 145891 NM 145892 NM 145893 2 64102 tenomodulin 2.71410735293089e-05 TL132 protein 5.61926014343038e-05 ubiquitin protein 9.08837395856669e-05 NM 022144 3 220594 NR 003554 4 89910 NM 130466 ligase E3B hsa04120: Ubiquitin mediated proteolysis NM 183415 5 1844 NM 004418 dual specificity 0.000165601301054344 phosphatase 2 hsa04010: Mitogenactivated protein kinase (MAPK) signaling pathway 6 2354 NM 001114171 NM 006732 FBJ murine osteo- 0.000165704687771452 hsa04380: Osteoclast sarcoma viral differentiation oncogene homolog hsa05030: Cocaine B addiction hsa05031: Amphetamine addiction hsa05034: Alcoholism 7 8418 NR 002174 cytidine mono- 0.000169514884785619 phosphate-Nacetylneuraminic acid hydroxylase (CMP-N-acetyl neuraminate monooxygenase) pseudogene 8 2098 NM 001984 Full length insert 0.000178493462148643 cDNA clone YP41C11 76 C00469 APPENDIX A. 
RESULTS OF ALL SELECTED DATASETS Rank EntrezID GeneName P-value Pathway Adipocyte EHMN Recon1 RefSeqID 9 114836 NM 052931 10 26045 NM 015564 SLAM family 0.000204923151414557 member 6 leucine rich repeat 0.000220613989325125 transmembrane neuronal 2 Table A.22: The top 10 differentially expressed genes from the comparison ’insulin resistant vs. insulin sensitive omental tissue’ with the corresponding pathways and reporter metabolites. Insulin resistant vs. insulin sensitive subcutaneous tissue Rank EntrezID GeneName P-value Pathway Adipocyte EHMN Recon1 RefSeqID 1 84281 NM 001042519 hypothetical 4.42412124189397e-05 protein MGC13057 NM 001042520 NM 001042521 NM 032321 2 9096 T-box 18 9.63669135993597e-05 chromosome 12 0.000127217462105046 NM 001080508 3 80763 NM 030572 hsa04360: Axon guidance open reading frame 39 4 1948 ephrin-B2 0.000142267737510861 cyclin B1 0.000166836605789672 NM 004093 5 891 NM 031966 hsa04110: Cell cycle hsa04114: Oocyte meiosis hsa04115: p53 signaling pathway hsa04914: Progesteronemediated oocyte maturation 6 84749 NM 032663 7 ubiquitin specific 0.000193508782961701 peptidase 30 10335 murine retrovirus NM 001100163 integration site 1 NM 001100167 homolog 0.00026442219404752 hsa04270: Vascular smooth muscle contraction NM 130385 NM 001098579 8 440279 NM 001080534 9 389722 XM 927067 unc-13 homolog C 0.00030208007761224 (C. elegans) similar to cell re- hsa04721: Synaptic vesicle cycle 0.000313536251566601 cognition molecule CASPR3 10 220594 TL132 protein 0.000371949789481665 NR 003554 Table A.23: The top 10 differentially expressed genes from the comparison ’insulin resistant vs. insulin sensitive subcutaneous tissue’ with the corresponding pathways and reporter metabolites. A.2.2 Comparison between the models The following tables show the top 10 reporter metabolites of one model in comparison to the rank of these metabolites using the other two models and the same expression data. 77 APPENDIX A. 
Insulin resistant vs. insulin sensitive omental tissue

| KEGG ID | Metabolite name | Adipocyte model Rank | Adipocyte model P-value | EHMN model Rank | EHMN model P-value | Recon 1 model Rank | Recon 1 model P-value |
|---|---|---|---|---|---|---|---|
| C00058 | Formate | 1 | 0.0000345065 | 777 | 0.343703 | 2 | 0.00379778 |
| C01031 | S-Formylglutathione | 2 | 0.00415728 | 775 | 0.343151 | 4 | 0.00432754 |
| C02934 | 3-Dehydrosphinganine | 3 | 0.014462 | 97 | 0.0530286 | 36 | 0.0335574 |
| C00083 | Malonyl-CoA | 4 | 0.0296929 | 146 | 0.0721345 | 67 | 0.0545762 |
| C00051 | Reduced glutathione | 5 | 0.0302962 | 1539 | 0.685389 | 156 | 0.117349 |
| C00064 | L-Glutamine | 6 | 0.0384255 | 1139 | 0.497377 | 129 | 0.0988624 |
| C00033 | Acetate | 7 | 0.0500906 | 970 | 0.427174 | 319 | 0.244038 |
| C00186 | L-Lactate | 7 | 0.0500906 | 123 | 0.0603206 | 1007 | 0.784773 |
| C00006 | Nicotinamide adenine dinucleotide phosphate | 8 | 0.050133 | 418 | 0.185274 | 46 | 0.0390321 |
| C00005 | Nicotinamide adenine dinucleotide phosphate - reduced | 8 | 0.050133 | 1103 | 0.482451 | 38 | 0.0342608 |
| C00013 | Diphosphate | 9 | 0.0589676 | 1534 | 0.683081 | 761 | 0.567991 |
| C00080 | H+ | 10 | 0.060263 | 284 | 0.130459 | 680 | 0.50305 |

Table A.24: The top 10 reporter metabolites of the comparison ‘insulin resistant vs. insulin sensitive omental tissue’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
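In the adipocyte rank column of Table A.24, metabolites with identical P-values share a rank and the next distinct P-value receives the next integer (Acetate and L-Lactate share rank 7, followed by rank 8). This pattern corresponds to what is commonly called dense ranking. The Python sketch below reproduces that column from the P-values. The function name `dense_rank` is hypothetical, and the thesis does not state how the ranks were actually computed, so this is an interpretation of the table rather than the author's procedure.

```python
def dense_rank(p_values):
    """Return dense ranks (ties share a rank, no gaps) for a list of P-values."""
    # Map each distinct P-value to its 1-based position in ascending order.
    position = {p: i + 1 for i, p in enumerate(sorted(set(p_values)))}
    return [position[p] for p in p_values]


# Adipocyte-model P-values copied from Table A.24, in table order.
adipocyte_p = [
    0.0000345065, 0.00415728, 0.014462, 0.0296929, 0.0302962, 0.0384255,
    0.0500906, 0.0500906, 0.050133, 0.050133, 0.0589676, 0.060263,
]

ranks = dense_rank(adipocyte_p)
print(ranks)
```

Dense ranking explains why a "top 10" table can contain more than ten rows: every metabolite tied at a rank of 10 or better is listed.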
| KEGG ID | Metabolite name | EHMN model Rank | EHMN model P-value | Adipocyte model Rank | Adipocyte model P-value | Recon 1 model Rank | Recon 1 model P-value |
|---|---|---|---|---|---|---|---|
| C01416 | Cocaine | 1 | 0.000450126 | NA | NA | 808 | 0.606428 |
| C12448 | Ecgonine methyl ester | 1 | 0.000450126 | NA | NA | 801 | 0.598147 |
| C14818 | Fe2+ | 2 | 0.00172031 | NA | NA | NA | NA |
| C14819 | Fe3+ | 2 | 0.00172031 | NA | NA | 820 | 0.614619 |
| C00145 | Thiol | 3 | 0.00215296 | NA | NA | NA | NA |
| C00496 | Ubiquitin | 3 | 0.00215296 | NA | NA | NA | NA |
| C04090 | Ubiquitin C-terminal thiolester | 3 | 0.00215296 | NA | NA | NA | NA |
| C00346 | Ethanolamine phosphate | 4 | 0.00315727 | NA | NA | 201 | 0.155935 |
| C00029 | UDPglucose | 5 | 0.00355117 | 115 | 0.34338 | 445 | 0.343342 |
| C00097 | L-Cysteine | 6 | 0.00366629 | 146 | 0.432218 | 43 | 0.0382238 |
| C04419 | Carboxybiotin-carboxyl-carrier protein | 7 | 0.00455038 | NA | NA | NA | NA |
| C06250 | Holo-[carboxylase] | 7 | 0.00455038 | NA | NA | NA | NA |
| CE6241 | S-(9-deoxy-delta12-PGD2)-glutathione | 8 | 0.00551476 | NA | NA | NA | NA |
| CE6243 | S-(9-deoxy-delta9,12-PGD2)-glutathione | 8 | 0.00551476 | NA | NA | NA | NA |
| C04549 | 1-Phosphatidyl-1D-myo-inositol 3-phosphate | 9 | 0.00651937 | NA | NA | 731 | 0.546049 |
| CE5132 | 1-phosphatidyl-myo-inositol 3,5-bisphosphate | 9 | 0.00651937 | NA | NA | NA | NA |
| C05959 | 11-epi-Prostaglandin F2alpha | 10 | 0.00655607 | NA | NA | NA | NA |
| C00639 | Prostaglandin F2alpha | 10 | 0.00655607 | NA | NA | 1196 | 0.963089 |
| CE6244 | S-(11-hydroxy-9-deoxy-delta12-PGD2)-glutathione | 10 | 0.00655607 | NA | NA | NA | NA |
| CE6245 | S-(11-OH-9-deoxy-delta9,12-PGD2)-glutathione | 10 | 0.00655607 | NA | NA | NA | NA |

Table A.25: The top 10 reporter metabolites of the comparison ‘insulin resistant vs. insulin sensitive omental tissue’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
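The cross-model columns of Table A.25 can be read as a lookup by KEGG ID: each top-ranked metabolite of the reference model is searched in the other two models, and NA is reported when the ID is absent there. The Python sketch below illustrates this with three rows copied from Table A.25. The dictionary layout and the function name `compare` are assumptions made for illustration and do not reflect how the thesis produced the tables.

```python
# (rank, P-value) keyed by KEGG ID; values copied from Table A.25.
ehmn = {
    "C14818": (2, 0.00172031),   # Fe2+
    "C00029": (5, 0.00355117),   # UDPglucose
    "C00097": (6, 0.00366629),   # L-Cysteine
}
adipocyte = {"C00029": (115, 0.34338), "C00097": (146, 0.432218)}
recon1 = {"C00029": (445, 0.343342), "C00097": (43, 0.0382238)}


def compare(reference, others):
    """List reference-model metabolites by rank with (rank, P-value) from the other models."""
    rows = []
    # Sort the reference metabolites by their rank in the reference model.
    for kegg_id, (rank, p) in sorted(reference.items(), key=lambda kv: kv[1][0]):
        row = [kegg_id, rank, p]
        for model in others:
            # A metabolite absent from another model is reported as NA.
            row.extend(model.get(kegg_id, ("NA", "NA")))
        rows.append(tuple(row))
    return rows


table = compare(ehmn, [adipocyte, recon1])
for row in table:
    print(*row, sep="\t")
```

A lookup of this kind only matches metabolites whose IDs agree across models, which is why incomplete KEGG annotation in a model shows up as NA entries.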
Recon1 model KEGG ID Metabolite name Adipocyte model Rank P-value Rank P-value EHMN model Rank P-value C15519 25-Hydroxycholesterol 1 0.00128471 NA NA NA NA C00058 Formate 2 0.00379778 237 0.70459 777 0.343703 C19586 8,9 epxoy aflatoxin B1 3 0.00390178 NA NA NA NA C06800 aflatoxin B1 3 0.00390178 NA NA NA NA C01031 S-Formylglutathione 4 0.00432754 2 0.00415728 775 0.343151 C06423 octanoate (n-C8:0) 5 0.0047828 0.258549 NA NA C05100 3-Ureidoisobutyrate 6 0.00681573 NA NA 255 0.121417 C01205 D-3-Amino-isobutanoate 6 0.00681573 NA NA 1377 0.601485 C00001 H2O 7 0.0094336 63 0.22086 1150 0.504173 C00114 Choline 8 0.00972819 71 0.257347 639 0.282791 C00469 Ethanol 9 0.00981991 14 0.0760583 C00010 Coenzyme A 10 0.011339 73 215 0.662553 52 352 0.0307616 0.160896 Table A.26: The top 10 reporter metabolites of the comparison ’insulin resistant vs. insulin sensitive omental tissue’ using the Recon 1 model in comparison to the adipocyte and EHMN model. 78 APPENDIX A. RESULTS OF ALL SELECTED DATASETS Insulin resistant vs. 
insulin sensitive subcutaneous tissue Adipocyte model KEGG ID C04640 Metabolite name 2-(Formamido)-N1-(5-phospho-D- EHMN model Rank P-value Rank P-value Recon 1 model Rank P-value 1 0.0051115 NA NA 5 0.00677677 1 0.0051115 NA NA 5 0.00677677 ribosyl)acetamidine C04376 N2-Formyl-N1-(5-phospho-Dribosyl)glycinamide C00311 Isocitrate 2 0.0139724 26 0.0163868 463 0.347262 C03090 5-Phospho-beta-D-ribosylamine 3 0.0194708 36 0.021962 25 0.021535 C00158 Citrate 4 0.0198505 1507 0.729585 1212 0.970628 C00013 Diphosphate 5 0.0217586 783 0.378562 997 C00279 D-Erythrose 4-phosphate 6 0.0266967 382 0.179697 32 0.0283795 C05382 Sedoheptulose 7-phosphate 6 0.0266967 797 0.386939 32 0.0283795 C02679 dodecanoate (C12:0) 7 0.0301599 152 0.0813766 701 0.544609 Eicosanoate (n-C20:0) 7 0.0301599 NA NA NA NA C00249 hexadecanoate (n-C16:0) 7 0.0301599 1516 0.733581 401 0.300038 C01530 octadecanoate (n-C18:0) 7 0.0301599 1038 0.4949 622 0.482883 pentadecanoate (C15:0) 7 0.0301599 NA NA NA NA C06424 tetradecanoate (C14:0) 7 0.0301599 152 0.0813766 252 0.198748 C00143 5,10-Methylenetetrahydrofolate 8 0.0307075 1841 C03838 N1-(5-Phospho-D-ribosyl)glycinamide 9 0.0378884 C03232 3-Phosphohydroxypyruvate 10 0.0411572 0.8083 0.885839 57 0.0452544 NA NA 55 0.0435885 500 0.222261 51 0.0415425 Table A.27: The top 10 reporter metabolites of the comparison ’insulin resistant vs. insulin sensitive subcutaneous tissue’ using the adipocyte model in comparison to the EHMN and Recon 1 model. 
| KEGG ID | Metabolite name | EHMN model Rank | EHMN model P-value | Adipocyte model Rank | Adipocyte model P-value | Recon 1 model Rank | Recon 1 model P-value |
|---|---|---|---|---|---|---|---|
| C01190 | Glucosylceramide | 1 | 0.000634892 | NA | NA | 1 | 0.000661951 |
| CE2435 | trans,cis-deca-2,4-dienoyl-CoA | 2 | 0.00345956 | NA | NA | NA | NA |
| C00018 | Pyridoxal phosphate | 3 | 0.00392722 | NA | NA | NA | NA |
| C00534 | Pyridoxamine | 3 | 0.00392722 | NA | NA | 558 | 0.421529 |
| C00647 | Pyridoxamine phosphate | 3 | 0.00392722 | NA | NA | NA | NA |
| C00314 | Pyridoxine | 3 | 0.00392722 | NA | NA | 558 | 0.421529 |
| C00627 | Pyridoxine phosphate | 3 | 0.00392722 | NA | NA | NA | NA |
| CE5787 | kinetensin 1-3 | 4 | 0.00443797 | NA | NA | NA | NA |
| C00105 | UMP | 5 | 0.00502282 | 242 | 0.733361 | 251 | 0.198241 |
| C00145 | Thiol | 6 | 0.0052974 | NA | NA | NA | NA |
| C00496 | Ubiquitin | 6 | 0.0052974 | NA | NA | NA | NA |
| C04090 | Ubiquitin C-terminal thiolester | 6 | 0.0052974 | NA | NA | NA | NA |
| C00439 | N-Formimino-L-glutamate | 7 | 0.0061186 | NA | NA | 287 | 0.222142 |
| C00026 | 2-Oxoglutarate | 8 | 0.00751671 | 196 | 0.584116 | 331 | 0.253392 |
| C00100 | Propanoyl-CoA | 9 | 0.00752986 | 44 | 0.107457 | 466 | 0.351782 |
| G00088 | (Gal)3 (Glc)1 (GlcNAc)2 (Neu5Ac)1 (Cer)1 | 10 | 0.00866419 | NA | NA | 6 | 0.00888409 |

Table A.28: The top 10 reporter metabolites of the comparison ‘insulin resistant vs. insulin sensitive subcutaneous tissue’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
RESULTS OF ALL SELECTED DATASETS Recon1 model KEGG ID Adipocyte model Metabolite name Rank P-value Rank P-value EHMN model Rank ceramide (homo sapiens) 1 0.000661951 NA NA C01190 glucocerebroside (homo sapiens) 1 0.000661951 NA NA C00083 Malonyl-CoA 2 0.000721689 57 0.129054 167 0.0879732 C00311 Isocitrate 3 0.00444189 230 0.676033 26 0.0163868 4 0.00524946 NA NA NA NA heparan sulfate, degradation product 16 4 0.00524946 NA NA NA NA heparan sulfate, degradation product 22 4 0.00524946 NA NA NA NA 2-(Formamido)-N1-(5-phospho-D- 5 0.00677677 1 0.0051115 NA NA 5 0.00677677 1 0.0051115 NA NA 10 0.00866419 chondroitin sulfate B / dermatan sulfate 684 P-value C00195 0.32046 1 0.000634892 (IdoA2S-GalNAc4S), degradation product 4 C04640 ribosyl)acetamidine C04376 N2-Formyl-N1-(5-phospho-Dribosyl)glycinamide G00088 VI3NeuAc-nLc6Cer 6 0.00888409 NA NA C00013 Diphosphate 7 0.00986705 280 0.866302 783 0.378562 C00318 L-Carnitine 8 0.0102979 55 0.122451 320 0.153054 2-Decaprenyl-3-methyl-5-hydroxy-6-methoxy- 9 0.010683 NA NA NA NA 0.0107119 242 0.733361 1,4-benzoquinone C00105 UMP 10 5 0.00502282 Table A.29: The top 10 reporter metabolites of the comparison ’insulin resistant vs. insulin sensitive subcutaneous tissue’ using the Recon 1 model in comparison to the adipocyte and EHMN model. A.2.3 Comparison of expression data The following tables show the comparison of the top 10 reporter metabolites using the different expression data of this dataset and the same model. 
Adipocyte

| KEGG ID | Metabolite name | omental Rank | omental P-value | subcutaneous Rank | subcutaneous P-value |
|---|---|---|---|---|---|
| C00058 | Formate | 1 | 0.0000345065 | 249 | 0.762469 |
| C01031 | S-Formylglutathione | 2 | 0.00415728 | 108 | 0.286822 |
| C02934 | 3-Dehydrosphinganine | 3 | 0.014462 | 28 | 0.0758283 |
| C00083 | Malonyl-CoA | 4 | 0.0296929 | 57 | 0.129054 |
| C00051 | Reduced glutathione | 5 | 0.0302962 | 132 | 0.343607 |
| C00064 | L-Glutamine | 6 | 0.0384255 | 247 | 0.750823 |
| C00033 | Acetate | 7 | 0.0500906 | 262 | 0.800853 |
| C00186 | L-Lactate | 7 | 0.0500906 | 17 | 0.0548723 |
| C00006 | Nicotinamide adenine dinucleotide phosphate | 8 | 0.050133 | 16 | 0.0539618 |
| C00005 | Nicotinamide adenine dinucleotide phosphate - reduced | 8 | 0.050133 | 16 | 0.0539618 |
| C00013 | Diphosphate | 9 | 0.0589676 | 280 | 0.866302 |
| C00080 | H+ | 10 | 0.060263 | 205 | 0.59862 |

Table A.30: The comparison of the top 10 reporter metabolites between ‘insulin resistant vs. insulin sensitive omental tissue’ and ‘insulin resistant vs. insulin sensitive subcutaneous tissue’ based on the adipocyte model.

| KEGG ID | Metabolite name | subcutaneous Rank | subcutaneous P-value | omental Rank | omental P-value |
|---|---|---|---|---|---|
| C04640 | 2-(Formamido)-N1-(5-phospho-D-ribosyl)acetamidine | 1 | 0.0051115 | 149 | 0.43856 |
| C04376 | N2-Formyl-N1-(5-phospho-D-ribosyl)glycinamide | 1 | 0.0051115 | 149 | 0.43856 |
| C00311 | Isocitrate | 2 | 0.0139724 | 29 | 0.121157 |
| C03090 | 5-Phospho-beta-D-ribosylamine | 3 | 0.0194708 | 30 | 0.121373 |
| C00158 | Citrate | 4 | 0.0198505 | 299 | 0.954296 |
| C00013 | Diphosphate | 5 | 0.0217586 | 261 | 0.798916 |
| C00279 | D-Erythrose 4-phosphate | 6 | 0.0266967 | 24 | 0.109783 |
| C05382 | Sedoheptulose 7-phosphate | 6 | 0.0266967 | 24 | 0.109783 |
| C02679 | dodecanoate (C12:0) | 7 | 0.0301599 | 56 | 0.206137 |
| | Eicosanoate (n-C20:0) | 7 | 0.0301599 | 56 | 0.206137 |
| C00249 | hexadecanoate (n-C16:0) | 7 | 0.0301599 | 56 | 0.206137 |
| C01530 | octadecanoate (n-C18:0) | 7 | 0.0301599 | 91 | 0.292422 |
| | pentadecanoate (C15:0) | 7 | 0.0301599 | 56 | 0.206137 |
| C06424 | tetradecanoate (C14:0) | 7 | 0.0301599 | 56 | 0.206137 |
| C00143 | 5,10-Methylenetetrahydrofolate | 8 | 0.0307075 | 168 | 0.485562 |
| C03838 | N1-(5-Phospho-D-ribosyl)glycinamide | 9 | 0.0378884 | 97 | 0.30705 |
| C03232 | 3-Phosphohydroxypyruvate | 10 | 0.0411572 | 127 | 0.376986 |

Table A.31: The comparison of the top 10 reporter metabolites between ‘insulin resistant vs. insulin sensitive subcutaneous tissue’ and ‘insulin resistant vs. insulin sensitive omental tissue’ based on the adipocyte model.

EHMN

| KEGG ID | Metabolite name | omental Rank | omental P-value | subcutaneous Rank | subcutaneous P-value |
|---|---|---|---|---|---|
| C01416 | Cocaine | 1 | 0.000450126 | 1657 | 0.804592 |
| C12448 | Ecgonine methyl ester | 1 | 0.000450126 | 1657 | 0.804592 |
| C14818 | Fe2+ | 2 | 0.00172031 | 276 | 0.131285 |
| C14819 | Fe3+ | 2 | 0.00172031 | 714 | 0.336101 |
| C00145 | Thiol | 3 | 0.00215296 | 602 | 0.276338 |
| C00496 | Ubiquitin | 3 | 0.00215296 | 1368 | 0.655487 |
| C04090 | Ubiquitin C-terminal thiolester | 3 | 0.00215296 | 602 | 0.276338 |
| C00346 | Ethanolamine phosphate | 4 | 0.00315727 | 116 | 0.0621875 |
| C00029 | UDPglucose | 5 | 0.00355117 | 100 | 0.0517659 |
| C00097 | L-Cysteine | 6 | 0.00366629 | 934 | 0.447135 |
| C04419 | Carboxybiotin-carboxyl-carrier protein | 7 | 0.00455038 | 701 | 0.331411 |
| C06250 | Holo-[carboxylase] | 7 | 0.00455038 | 76 | 0.0410101 |
| CE6241 | S-(9-deoxy-delta12-PGD2)-glutathione | 8 | 0.00551476 | 1219 | 0.577494 |
| CE6243 | S-(9-deoxy-delta9,12-PGD2)-glutathione | 8 | 0.00551476 | 1219 | 0.577494 |
| C04549 | 1-Phosphatidyl-1D-myo-inositol 3-phosphate | 9 | 0.00651937 | 1235 | 0.584706 |
| CE5132 | 1-phosphatidyl-myo-inositol 3,5-bisphosphate | 9 | 0.00651937 | 1846 | 0.889204 |
| C05959 | 11-epi-Prostaglandin F2alpha | 10 | 0.00655607 | 881 | 0.42114 |
| C00639 | Prostaglandin F2alpha | 10 | 0.00655607 | 1237 | 0.585272 |
| CE6244 | S-(11-hydroxy-9-deoxy-delta12-PGD2)-glutathione | 10 | 0.00655607 | 881 | 0.42114 |
| CE6245 | S-(11-OH-9-deoxy-delta9,12-PGD2)-glutathione | 10 | 0.00655607 | 881 | 0.42114 |

Table A.32: The comparison of the top 10 reporter metabolites between ‘insulin resistant vs. insulin sensitive omental tissue’ and ‘insulin resistant vs. insulin sensitive subcutaneous tissue’ based on the EHMN model.
RESULTS OF ALL SELECTED DATASETS subcutaneous KEGG ID omental Metabolite name Rank P-value C01190 Glucosylceramide 1 0.000634892 CE2435 trans,cis-deca-2,4-dienoyl-CoA 2 C00018 Pyridoxal phosphate 3 C00534 Pyridoxamine C00647 Rank P-value 37 0.0207825 0.00345956 68 0.0373244 0.00392722 1064 0.466204 3 0.00392722 1064 0.466204 Pyridoxamine phosphate 3 0.00392722 1064 0.466204 C00314 Pyridoxine 3 0.00392722 1064 0.466204 C00627 Pyridoxine phosphate 3 0.00392722 1064 0.466204 CE5787 kinetensin 1-3 4 0.00443797 154 0.0758005 C00105 UMP 5 0.00502282 959 0.422072 C00145 Thiol 6 0.0052974 1128 0.491042 C00496 Ubiquitin 6 0.0052974 640 0.283103 C04090 Ubiquitin C-terminal thiolester 6 0.0052974 1128 0.491042 C00439 N-Formimino-L-glutamate 7 0.0061186 200 0.0966571 C00026 2-Oxoglutarate 8 0.00751671 574 0.254968 C00100 Propanoyl-CoA 9 0.00752986 1224 0.533772 G00088 (Gal)3 (Glc)1 (GlcNAc)2 (Neu5Ac)1 (Cer)1 10 0.00866419 726 0.323394 Table A.33: The comparison of the top 10 reporter metabolites between ’insulin resistant vs. insulin sensitive subcutaneous tissue’ and ’insulin resistant vs. insulin sensitive omental tissue’ based on the EHMN model. Recon 1 omental KEGG ID subcutaneous Metabolite name Rank P-value Rank P-value C15519 25-Hydroxycholesterol 1 0.00128471 233 0.186681 C00058 Formate 2 0.00379778 239 0.191749 C19586 8,9 epxoy aflatoxin B1 3 0.00390178 763 0.604782 C06800 aflatoxin B1 3 0.00390178 763 0.604782 C01031 S-Formylglutathione 4 0.00432754 325 0.244277 C06423 octanoate (n-C8:0) 5 0.0047828 C05100 3-Ureidoisobutyrate 6 0.00681573 235 0.188754 C01205 D-3-Amino-isobutanoate 6 0.00681573 235 0.188754 C00001 H2O 7 0.0094336 326 0.24433 C00114 Choline 8 0.00972819 859 0.695636 C00469 Ethanol 9 0.00981991 1188 0.95315 C00010 Coenzyme A 599 0.46389 10 0.011339 66 0.0521972 Table A.34: The comparison of the top 10 reporter metabolites between ’insulin resistant vs. insulin sensitive omental tissue’ and ’insulin resistant vs. 
insulin sensitive subcutaneous tissue’ based on the Recon 1 model. subcutaneous KEGG ID omental Metabolite name Rank P-value Rank P-value C00195 ceramide (homo sapiens) 1 0.000661951 22 0.0215455 C01190 glucocerebroside (homo sapiens) 1 0.000661951 22 0.0215455 C00083 Malonyl-CoA 2 0.000721689 67 0.0545762 C00311 Isocitrate 3 0.00444189 93 0.0726228 chondroitin sulfate B / dermatan sulfate (IdoA2S-GalNAc4S), degradation 4 0.00524946 293 0.21864 heparan sulfate, degradation product 16 4 0.00524946 293 0.21864 heparan sulfate, degradation product 22 4 0.00524946 293 0.21864 C04640 2-(Formamido)-N1-(5-phospho-D-ribosyl)acetamidine 5 0.00677677 625 0.466608 C04376 N2-Formyl-N1-(5-phospho-D-ribosyl)glycinamide 5 0.00677677 625 0.466608 G00088 VI3NeuAc-nLc6Cer 6 0.00888409 431 0.332104 C00013 Diphosphate 7 0.00986705 761 0.567991 C00318 L-Carnitine 8 0.0102979 515 0.396884 2-Decaprenyl-3-methyl-5-hydroxy-6-methoxy-1,4-benzoquinone 9 0.010683 918 0.703013 0.0107119 395 0.305826 product 3 C00105 UMP 10 Table A.35: The comparison of the top 10 reporter metabolites between ’insulin resistant vs. insulin sensitive subcutaneous tissue’ and ’insulin resistant vs. insulin sensitive omental tissue’ based on the Recon 1 model. 82 APPENDIX A. RESULTS OF ALL SELECTED DATASETS A.3 Genome-wide analysis of adipose tissue gene expression in twin-pairs discordant for physical activity for over 30 years (GSE20536) Biopsy samples of adipose tissue from twin pairs that had been followed for their discordance for physical activity for 32 years [91]: - Two mono- and four dizygotic twins - Paired sample per twin pair The comparison of the twin pairs, active against non-active, is used for the calculation of the differentially expressed genes. A.3.1 Differentially expressed genes The following table shows the top 10 differentially expressed genes, the pathways they are involved in, and those top 10 reporter metabolites, which are also involved in these pathways. Active vs. 
non-active

| Rank | EntrezID | RefSeqID | GeneName | P-value | Pathway | Adipocyte | EHMN | Recon1 |
|---|---|---|---|---|---|---|---|---|
| 1 | 2162 | NM_000129.2 | Homo sapiens coagulation factor XIII, A1 polypeptide (F13A1), mRNA. | 2.50889117574308e-05 | hsa04610: Complement and coagulation cascades | | | |
| 2 | 55154 | NM_018116.2 | Homo sapiens misato homolog 1 (Drosophila) (MSTO1), mRNA. | 6.80124299891444e-05 | | | | |
| 3 | 6584 | NM_003060.2 | Homo sapiens solute carrier family 22 (organic cation transporter), member 5 (SLC22A5), mRNA. | 0.000207549280179673 | | | | |
| 4 | 440349 | XM_496129.2 | PREDICTED: Homo sapiens similar to nuclear pore complex interacting protein, transcript variant 1 (LOC440349), mRNA. | 0.00021302455892714 | | | | |
| 5 | 9270 | NM_004763.2 | Homo sapiens integrin beta 1 binding protein 1 (ITGB1BP1), transcript variant 1, mRNA. | 0.000320744391026067 | | | | |
| 6 | 81493 | NM_030786.1 | Homo sapiens syncoilin, intermediate filament 1 (SYNC1), mRNA. | 0.000422488677668484 | | | | |
| 7 | 2745 | NM_002064.1 | Homo sapiens glutaredoxin (thioltransferase) (GLRX), mRNA. | 0.000488692163262734 | | | | |
| 8 | 57129 | NM_020409.2 | Homo sapiens mitochondrial ribosomal protein L47 (MRPL47), nuclear gene encoding mitochondrial protein, transcript variant 1, mRNA. | 0.000533011668188304 | | | | |
| 9 | 6892 | NM_172209.1 | Homo sapiens TAP binding protein (tapasin) (TAPBP), transcript variant 3, mRNA. | 0.000575535751947747 | hsa04612: Antigen processing and presentation | | | |
| 10 | 10163 | NM_006990.2 | Homo sapiens WAS protein family, member 2 (WASF2), mRNA. | 0.000642954970236917 | hsa04520: Adherens junction | | | |
| | | | | | hsa04666: Fc gamma R-mediated phagocytosis | | | |
| | | | | | hsa04810: Regulation of actin cytoskeleton | | | |
| | | | | | hsa05100: Bacterial invasion of epithelial cells | | | |
| | | | | | hsa05131: Shigellosis | | | |
| | | | | | hsa05132: Salmonella infection | | | |

Table A.36: The top 10 differentially expressed genes from the comparison ’active vs. non-active’ with the corresponding pathways and reporter metabolites.
A.3.2 Comparison between the models

The following tables show the top 10 reporter metabolites of one model together with the ranks of these metabolites in the other two models for the same expression data.

Active vs. non-active

| KEGG ID | Metabolite name | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| C00008 | ADP | 1 | 0.00167407 | 523 | 0.228823 | 324 | 0.276925 |
| C00016 | FAD | 2 | 0.0146813 | 597 | 0.248455 | 117 | 0.111056 |
| C01352 | FADH2 | 2 | 0.0146813 | 808 | 0.336281 | 117 | 0.111056 |
| C04895 | 2-Amino-4-hydroxy-6-(erythro-1,2,3-trihydroxypropyl)dihydropteridine triphosphate | 3 | 0.0164251 | 274 | 0.121031 | 24 | 0.0261545 |
| C00002 | ATP | 4 | 0.0197332 | 136 | 0.0651738 | 105 | 0.100173 |
| C00091 | Succinyl-CoA | 5 | 0.0277063 | 1637 | 0.741003 | 95 | 0.092701 |
| C00440 | 5-Methyltetrahydrofolate | 6 | 0.0285175 | 1945 | 0.902636 | 694 | 0.581199 |
| C00042 | Succinate | 7 | 0.0313222 | 780 | 0.327249 | 644 | 0.538272 |
| C01236 | 6-phospho-D-glucono-1,5-lactone | 8 | 0.0423082 | 111 | 0.0577854 | 205 | 0.182101 |
| C00164 | Acetoacetate | 9 | 0.048282 | 236 | 0.107325 | 81 | 0.0838792 |
| C03912 | 1-Pyrroline-5-carboxylate | 10 | 0.0487278 | 225 | 0.103803 | 123 | 0.11462 |
| C00148 | L-Proline | 10 | 0.0487278 | 2048 | 0.95538 | 661 | 0.550038 |

Table A.37: The top 10 reporter metabolites of the comparison ’active vs. non-active’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
| KEGG ID | Metabolite name | EHMN rank | EHMN P-value | Adipocyte rank | Adipocyte P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| C00047 | L-Lysine | 1 | 0.00196101 | NA | NA | 1237 | 0.992782 |
| C01094 | D-Fructose 1-phosphate | 2 | 0.00262318 | NA | NA | 5 | 0.00302379 |
| C00518 | Hyaluronate | 3 | 0.00534445 | NA | NA | 916 | 0.759486 |
| C00167 | UDPglucuronate | 3 | 0.00534445 | NA | NA | 912 | 0.755742 |
| C03391 | DNA 6-methylaminopurine | 4 | 0.00566266 | NA | NA | NA | NA |
| C00821 | DNA adenine | 4 | 0.00566266 | NA | NA | NA | NA |
| C06893 | 2-Deoxy-5-keto-D-gluconic acid 6-phosphate | 5 | 0.00709321 | NA | NA | NA | NA |
| C00222 | 3-Oxopropanoate | 5 | 0.00709321 | NA | NA | 337 | 0.286842 |
| CE1186 | D-xylulose-1-phosphate | 5 | 0.00709321 | NA | NA | NA | NA |
| C00111 | Glycerone phosphate | 5 | 0.00709321 | 72 | 0.266625 | 67 | 0.0720443 |
| C00266 | Glycolaldehyde | 5 | 0.00709321 | NA | NA | 287 | 0.249745 |
| CE2054 | 20-carboxy-leukotriene-B4 | 6 | 0.00754237 | NA | NA | NA | NA |
| C00301 | ADPribose | 7 | 0.00838736 | NA | NA | 57 | 0.0608186 |
| C00310 | D-Xylulose | 8 | 0.00843894 | NA | NA | 143 | 0.126706 |
| C00318 | L-Carnitine | 9 | 0.00858839 | 288 | 0.970025 | 614 | 0.51831 |
| C02839 | L-Tyrosyl-tRNA(Tyr) | 10 | 0.0106306 | NA | NA | NA | NA |
| C00787 | tRNA(Tyr) | 10 | 0.0106306 | NA | NA | NA | NA |

Table A.38: The top 10 reporter metabolites of the comparison ’active vs. non-active’ using the EHMN model in comparison to the adipocyte and Recon 1 model.

| KEGG ID | Metabolite name | Recon 1 rank | Recon 1 P-value | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value |
|---|---|---|---|---|---|---|---|
| | D-Tagatose 1-phosphate | 1 | 0.000367843 | NA | NA | NA | NA |
| C00072 | L-Ascorbate | 2 | 0.00095762 | NA | NA | 855 | 0.363183 |
| C00584 | Prostaglandin E2 | 3 | 0.00236753 | NA | NA | 207 | 0.0934877 |
| C00577 | D-Glyceraldehyde | 4 | 0.00263828 | NA | NA | 96 | 0.0506967 |
| C01094 | D-Fructose 1-phosphate | 5 | 0.00302379 | NA | NA | 2 | 0.00262318 |
| | D-Xylulose 1-phosphate | 5 | 0.00302379 | NA | NA | NA | NA |
| C00518 | hyaluronan | 6 | 0.00591994 | NA | NA | 3 | 0.00534445 |
| | R total 2 coenzyme A | 7 | 0.00671575 | NA | NA | NA | NA |
| | pristanic acid | 8 | 0.00852173 | NA | NA | NA | NA |
| C00795 | D-Tagatose | 9 | 0.0092775 | NA | NA | NA | NA |
| C00301 | ADPribose | 10 | 0.00945689 | NA | NA | 109 | 0.0571841 |

Table A.39: The top 10 reporter metabolites of the comparison ’active vs.
non-active’ using the Recon 1 model in comparison to the adipocyte and EHMN model.

A.4 Differences in subcutaneous adipose tissue gene expression between obese African Americans and Hispanic Youths (GSE23506)

Cross-sectional study design to compare subcutaneous adipose tissue gene expression profiles [92]:
- 17 African Americans
- 19 Hispanics

The comparison of the African Americans against Hispanics is used for the calculation of the differentially expressed genes.

A.4.1 Differentially expressed genes

The following table shows the top 10 differentially expressed genes, the pathways they are involved in, and those of the top 10 reporter metabolites that are also involved in these pathways.

African Americans vs. Hispanics

| Rank | EntrezID | RefSeqID | GeneName | P-value | Pathway | Adipocyte | EHMN | Recon1 |
|---|---|---|---|---|---|---|---|---|
| 1 | 6228 | NM_001025.4 | Homo sapiens ribosomal protein S23 (RPS23), mRNA. | 4.66501824511915e-07 | hsa03010: Ribosome | | | |
| 2 | 642934 | XM_942991.2 | PREDICTED: Homo sapiens hypothetical LOC642934 (LOC642934), mRNA. | 5.37782222890024e-07 | | | | |
| 3 | 6428 | NM_003017.3 | Homo sapiens splicing factor, arginine/serine-rich 3 (SFRS3), mRNA. | 4.1302884682886e-05 | hsa03040: Spliceosome | | | |
| | | | | | hsa05168: Herpes simplex infection | | | |
| 4 | 10901 | NM_021004.2 | Homo sapiens dehydrogenase/reductase (SDR family) member 4 (DHRS4), mRNA. | 4.68304117637516e-05 | hsa00830: Retinol metabolism | | | |
| | | | | | hsa01100: Metabolic pathways | C00011, C00091, C00214, C00337, C00363, C00364, C00864, C01346 | C00191, C00584, C00639, C00641, C02165, C05455 | C00025, C00052, C00058, C00209, C00427, C00581, C01194, C05635, C05951 |
| | | | | | hsa04146: Peroxisome | | | |
| 5 | 6231 | NM_001029.3 | Homo sapiens ribosomal protein S26 (RPS26), mRNA. | 5.1153372883841e-05 | hsa03010: Ribosome | | | |
| 6 | 222967 | NM_173565.1 | Homo sapiens hypothetical protein LOC222967 (LOC222967), mRNA. | 5.82154912734042e-05 | | | | |
| 7 | 26254 | NM_014359.3 | Homo sapiens opticin (OPTC), mRNA. | 6.85759996704967e-05 | | | | |
| 8 | 650646 | XM_942527.2 | PREDICTED: Homo sapiens similar to 40S ribosomal protein S26 (LOC650646), mRNA. | 9.7940554821229e-05 | | | | |
| 9 | 84632 | NM_032550.2 | Homo sapiens actin filament associated protein 1-like 2 (AFAP1L2), transcript variant 2, mRNA. | 0.000104322064758533 | | | | |
| 10 | 6710 | NM_001024858.1 | Homo sapiens spectrin, beta, erythrocytic (includes spherocytosis, clinical type I) (SPTB), transcript variant 1, mRNA. | 0.000106741579871336 | | | | |

Table A.40: The top 10 differentially expressed genes from the comparison ’African Americans vs. Hispanics’ with the corresponding pathways and reporter metabolites.

A.4.2 Comparison between the models

The following tables show the top 10 reporter metabolites of one model together with the ranks of these metabolites in the other two models for the same expression data.

African Americans vs. Hispanics

| KEGG ID | Metabolite name | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| C00011 | CO2 | 1 | 0.00955117 | 1211 | 0.550168 | 353 | 0.265944 |
| C00288 | Bicarbonate | 2 | 0.0100439 | 70 | 0.0349696 | 8 | 0.0102825 |
| C00337 | (S)-Dihydroorotate | 3 | 0.0195315 | 290 | 0.142493 | 46 | 0.0372433 |
| | (S)-Methylmalonate semialdehyde | 4 | 0.0201632 | 470 | 0.211569 | NA | NA |
| | Methylmalonate | 4 | 0.0201632 | 470 | 0.211569 | NA | NA |
| C00864 | (R)-Pantothenate | 5 | 0.0220845 | 73 | 0.0358939 | 44 | 0.0359254 |
| C01346 | dUDP | 6 | 0.0245557 | 1152 | 0.521576 | 708 | 0.539504 |
| C00364 | dTMP | 7 | 0.0292051 | 1696 | 0.795349 | 121 | 0.0995272 |
| C00363 | dTDP | 8 | 0.0352166 | 1152 | 0.521576 | 392 | 0.299371 |
| C00214 | Thymidine | 9 | 0.0368675 | 1783 | 0.837454 | 409 | 0.311549 |
| C00091 | Succinyl-CoA | 10 | 0.0410171 | 927 | 0.41604 | 489 | 0.366924 |

Table A.41: The top 10 reporter metabolites of the comparison ’African Americans vs. Hispanics’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
| KEGG ID | Metabolite name | EHMN rank | EHMN P-value | Adipocyte rank | Adipocyte P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| CE4988 | 10,11-dihydro-12-epi-leukotriene B4 | 1 | 0.000235915 | NA | NA | NA | NA |
| CE5944 | 10,11-dihydro-12-oxo-LTB4 | 1 | 0.000235915 | NA | NA | NA | NA |
| CE4993 | 10,11-dihydro-12R-hydroxy-leukotriene C4 | 1 | 0.000235915 | NA | NA | NA | NA |
| CE5976 | 12-oxo-10,11-dihydro-20-COOH-LTB4 | 1 | 0.000235915 | NA | NA | NA | NA |
| CE5525 | 12-oxo-20-carboxy-leukotriene B4 | 1 | 0.000235915 | NA | NA | NA | NA |
| CE5947 | 20-COOH-10,11-dihydro-LTB4 | 1 | 0.000235915 | NA | NA | NA | NA |
| C04805 | 5(S)-HETE | 1 | 0.000235915 | NA | NA | NA | NA |
| CE6246 | 5,12-DiHETE | 1 | 0.000235915 | NA | NA | NA | NA |
| CE7085 | 5-HEPE | 1 | 0.000235915 | NA | NA | NA | NA |
| CE2084 | 5-oxo-(6E,8Z,11Z,14Z)-eicosatetraenoic acid | 1 | 0.000235915 | NA | NA | NA | NA |
| CE7097 | 5-oxo-12(S)-hydroxy-eicosa-6E,8Z,10E,14Z-tetraenoate | 1 | 0.000235915 | NA | NA | NA | NA |
| CE5178 | 5-oxo-6-trans-leukotriene B4 | 1 | 0.000235915 | NA | NA | NA | NA |
| CE5349 | 5-oxo-6E-12-epi-LTB4 | 1 | 0.000235915 | NA | NA | NA | NA |
| CE7111 | 5-oxo-EPE | 1 | 0.000235915 | NA | NA | NA | NA |
| CE5350 | 6,7-dihydro-12-epi-LTB4 | 1 | 0.000235915 | NA | NA | NA | NA |
| CE5352 | 6,7-dihydro-leukotriene B4 | 1 | 0.000235915 | NA | NA | NA | NA |
| CE2445 | 6-trans-leukotriene B4 | 1 | 0.000235915 | NA | NA | NA | NA |
| CE2446 | 6E-12-epi-LTB4 | 1 | 0.000235915 | NA | NA | NA | NA |
| C02165 | Leukotriene B4 | 1 | 0.000235915 | NA | NA | 510 | 0.387886 |
| C00639 | Prostaglandin F2alpha | 1 | 0.000235915 | NA | NA | 133 | 0.104798 |
| CE5531 | 12-oxo-c-LTB3 | 2 | 0.00220967 | NA | NA | NA | NA |
| CE4990 | 12-oxo-leukotriene B4 | 2 | 0.00220967 | NA | NA | NA | NA |
| C03512 | L-Tryptophanyl-tRNA(Trp) | 3 | 0.00245117 | NA | NA | NA | NA |
| C01652 | tRNA(Trp) | 3 | 0.00245117 | NA | NA | NA | NA |
| C00584 | Prostaglandin E2 | 4 | 0.00296839 | NA | NA | 102 | 0.0889991 |
| C05457 | 7alpha,12alpha-Dihydroxycholest-4-en-3-one | 5 | 0.00330511 | NA | NA | 15 | 0.0131415 |
| C05455 | 7alpha-Hydroxycholest-4-en-3-one | 5 | 0.00330511 | NA | NA | 63 | 0.051964 |
| C13856 | 2-Arachidonoylglycerol | 6 | 0.00342886 | NA | NA | NA | NA |
| CE4987 | 10,11-dihydro-leukotriene B4 | 7 | 0.00347994 | NA | NA | NA | NA |
| CE2054 | 20-carboxy-leukotriene-B4 | 7 | 0.00347994 | NA | NA | NA | NA |
| CE5343 | 6,7-dihydro-5-oxo-12-epi-LTB4 | 7 | 0.00347994 | NA | NA | NA | NA |
| CE5179 | 6,7-dihydro-5-oxo-leukotriene B4 | 7 | 0.00347994 | NA | NA | NA | NA |
| C00191 | D-Glucuronate | 8 | 0.00378807 | NA | NA | 68 | 0.0539488 |
| C00641 | 1,2-Diacyl-sn-glycerol | 9 | 0.00455281 | NA | NA | NA | NA |
| C00066 | tRNA | 10 | 0.00570316 | NA | NA | NA | NA |

Table A.42: The top 10 reporter metabolites of the comparison ’African Americans vs. Hispanics’ using the EHMN model in comparison to the adipocyte and Recon 1 model.

| KEGG ID | Metabolite name | Recon 1 rank | Recon 1 P-value | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value |
|---|---|---|---|---|---|---|---|
| C00209 | Oxalate | 1 | 0.00146939 | NA | NA | 815 | 0.360454 |
| C00581 | Guanidinoacetate | 2 | 0.00511799 | NA | NA | 17 | 0.00943231 |
| C06196 | dIMP | 3 | 0.00536709 | NA | NA | NA | NA |
| C00025 | L-Glutamate | 4 | 0.00625135 | 223 | 0.792349 | 1860 | 0.873716 |
| C05951 | Leukotriene D4 | 4 | 0.00625135 | NA | NA | 1733 | 0.814375 |
| C00052 | UDPgalactose | 5 | 0.0075734 | NA | NA | 229 | 0.114645 |
| C00288 | Bicarbonate | 6 | 0.0101064 | 233 | 0.833987 | 70 | 0.0349696 |
| C05635 | 5-Hydroxyindoleacetate | 7 | 0.0103664 | NA | NA | 1646 | 0.768745 |
| C00427 | Prostaglandin H2 | 8 | 0.0105947 | NA | NA | 774 | 0.345731 |
| C01194 | phosphatidylinositol (homo sapiens) | 9 | 0.0108149 | NA | NA | 1964 | 0.913392 |
| C00058 | Formate | 10 | 0.0121275 | 181 | 0.637771 | 130 | 0.0608291 |

Table A.43: The top 10 reporter metabolites of the comparison ’African Americans vs. Hispanics’ using the Recon 1 model in comparison to the adipocyte and EHMN model.

A.5 Subcutaneous adipose tissue: comparison of weight maintenance and weight regain following an 8-week low calorie diet (GSE24432)

Forty women followed a dietary protocol consisting of an 8-week low calorie diet (LCD) and a 6-month weight maintenance phase [93]:
- 20 probands were classified as weight maintainers (WM)
- 20 probands were classified as weight regainers (WR)
- 2 paired samples per person, one before and one after LCD

The following comparisons were applied for the calculation of the differentially expressed genes: (i) weight maintenance - before low calorie diet vs. after low calorie diet and (ii) weight regainer - before low calorie diet vs. after low calorie diet.
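The paired design described above (one sample before and one after the LCD for each person) can be illustrated with a minimal sketch. It computes an ordinary paired t-statistic for a single gene. This is a simplification chosen for illustration and not the limma procedure used in the thesis; the function name `paired_t_statistic` and the expression values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(before, after):
    """Return (mean difference, t statistic) for paired measurements.

    Assumption: an ordinary paired t-test on per-person differences,
    used only to illustrate the paired design (not limma's moderated test).
    """
    if len(before) != len(after):
        raise ValueError("each person needs one 'before' and one 'after' value")
    # One difference per person: before LCD minus after LCD.
    diffs = [b - a for b, a in zip(before, after)]
    mean_diff = mean(diffs)
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean_diff, mean_diff / standard_error


# Hypothetical log2 expression values of one gene in four persons.
before_lcd = [8.0, 7.5, 9.0, 8.5]
after_lcd = [7.0, 7.0, 8.0, 7.5]
mean_diff, t_stat = paired_t_statistic(before_lcd, after_lcd)
```

Working on the per-person differences removes the between-person variation, which is the reason for taking two samples from the same individual.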
A.5.1 Differentially expressed genes

The following tables show the top 10 differentially expressed genes, the pathways they are involved in, and those of the top 10 reporter metabolites that are also involved in these pathways.

Weight maintenance - before low calorie diet vs. after low calorie diet

| Rank | EntrezID | RefSeqID | GeneName | P-value | Pathway | Adipocyte | EHMN | Recon1 |
|---|---|---|---|---|---|---|---|---|
| 1 | 9415 | NM_004265 | fatty acid desaturase 2 | 1.11927141264039e-14 | hsa00592: alpha-Linolenic acid metabolism | | | |
| | | | | | hsa01040: Biosynthesis of unsaturated fatty acids | C00510 | C00154, C00412, C00510, C02050, C02249 | C00510, C03035, C16163 |
| | | | | | hsa03320: PPAR signaling pathway | | | |
| 2 | 10614 | NM_006460 | hexamethylene bisacetamide inducible 1 | 1.73814074790391e-13 | | | | |
| 3 | 54518 | NM_019043 | amyloid beta (A4) precursor protein-binding, family B, member 1 interacting protein | 1.51774826561344e-12 | | | | |
| 4 | 2876 | NM_201397 | glutathione peroxidase 1 | 7.09742698715859e-12 | hsa00480: Glutathione metabolism | | | |
| | | | | | hsa00590: Arachidonic acid metabolism | | C04742, C04805, C05356, C05965, C05966 | |
| | | | | | hsa05014: Amyotrophic lateral sclerosis (ALS) | | | |
| | | | | | hsa05016: Huntington’s disease | | | |
| 5 | 6319 | NM_005063 | stearoyl-CoA desaturase (delta-9-desaturase) | 1.07987914784104e-11 | hsa01040: Biosynthesis of unsaturated fatty acids | C00510 | C00154, C00412, C00510, C02050, C02249 | C00510, C03035, C16163 |
| | | | | | hsa03320: PPAR signaling pathway | | | |
| 6 | 60481 | NM_021814 | ELOVL fatty acid elongase 5 | 2.34361061109303e-11 | hsa00062: Fatty acid elongation | C05272 | C00154, C01944 | C05272 |
| | | | | | hsa01040: Biosynthesis of unsaturated fatty acids | C00510 | C00154, C00412, C00510, C02050, C02249 | C00510, C03035, C16163 |
| 7 | 25878 | NM_015419 | matrix-remodelling associated 5 | 2.6687478989018e-11 | | | | |
| 8 | 7280 | NM_001069 | tubulin, beta 2A class IIa | 3.06228306097627e-11 | hsa04145: Phagosome | C00007 | | |
| | | | | | hsa04540: Gap junction | | | |
| | | | | | hsa05130: Pathogenic Escherichia coli infection | | | |
| 9 | 23531 | NM_012329 | monocyte to macrophage differentiation-associated | 1.49523954543299e-10 | | | | |
| 10 | 493 | NM_001001396 | ATPase, Ca++ transporting, plasma membrane 4 | 1.67674757732865e-10 | hsa04020: Calcium signaling pathway | | | C00004 |
| | | | | | hsa04970: Salivary secretion | | | |
| | | | | | hsa04972: Pancreatic secretion | | | |

Table A.44: The top 10 differentially expressed genes from the comparison ’WM - before LCD vs. after LCD’ with the corresponding pathways and reporter metabolites.

Weight regainer - before low calorie diet vs. after low calorie diet

| Rank | EntrezID | RefSeqID | GeneName | P-value | Pathway | Adipocyte | EHMN | Recon1 |
|---|---|---|---|---|---|---|---|---|
| 1 | 9415 | NM_004265 | fatty acid desaturase 2 | 2.64051773458427e-11 | hsa00592: alpha-Linolenic acid metabolism | | | |
| | | | | | hsa01040: Biosynthesis of unsaturated fatty acids | | C00154, C00412, C00510, C02050, C02249, C03035, C06426 | C00510 |
| | | | | | hsa03320: PPAR signaling pathway | | | |
| 2 | 10957 | NM_006813 | proline-rich nuclear receptor coactivator 1 | 3.3207161581569e-11 | | | | |
| 3 | 3183 | NM_031314 | heterogeneous nuclear ribonucleoprotein C (C1/C2) | 6.77857173789559e-11 | hsa03040: Spliceosome | | | |
| 4 | 6319 | NM_005063 | stearoyl-CoA desaturase (delta-9-desaturase) | 1.34364957647302e-10 | hsa01040: Biosynthesis of unsaturated fatty acids | | C00154, C00412, C00510, C02050, C02249, C03035, C06426 | C00510 |
| | | | | | hsa03320: PPAR signaling pathway | | | |
| 5 | 6202 | NM_001012 | ribosomal protein S8 | 2.69976568398775e-10 | hsa03010: Ribosome | | | |
| 6 | 4869 | NM_002520 | nucleophosmin (nucleolar phosphoprotein B23, numatrin) | 3.28854625070913e-10 | | | | |
| 7 | 124 | NM_000667 | alcohol dehydrogenase 1A (class I), alpha polypeptide | 3.47714796918351e-10 | hsa00010: Glycolysis/Gluconeogenesis | | | |
| | | | | | hsa00071: Fatty acid metabolism | | C00154 | |
| | | | | | hsa00350: Tyrosine metabolism | | | C05576 |
| | | | | | hsa00830: Retinol metabolism | | | |
| | | | | | hsa00982: Drug metabolism | | | |
| | | | | | hsa01100: Metabolic pathways | C00003, C00004, C00007, C00129, C00155, C00235, C00341, C00448 | C00154, C00222, C00447, C00577, C01094, C06426 | C00003, C00004, C00051, C00072, C00132, C00577, C01094, C05576 |
| 8 | 51429 | NM_016224 | sorting nexin 9 | 1.96217224619829e-09 | | | | |
| 9 | 23479 | NM_014301 | iron-sulfur cluster scaffold homolog (E. coli) | 2.6234021276621e-09 | | | | |
| 10 | 9775 | NM_014740 | eukaryotic translation initiation factor 4A3 | 2.70101470848801e-09 | hsa03013: RNA transport | | | |
| | | | | | hsa03015: mRNA surveillance pathway | | | |
| | | | | | hsa03040: Spliceosome | | | |

Table A.45: The top 10 differentially expressed genes from the comparison ’WR - before LCD vs. after LCD’ with the corresponding pathways and reporter metabolites.

A.5.2 Comparison between the models

The following tables show the top 10 reporter metabolites of one model together with the ranks of these metabolites in the other two models for the same expression data.

Weight maintenance - before low calorie diet vs.
after low calorie diet

| KEGG ID | Metabolite name | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| | octadecadienoyl-CoA (C18:2CoA, n-6) | 1 | 0.00000292448 | NA | NA | NA | NA |
| | eicosadienoyl-CoA (C20:2CoA, n-6) | 2 | 0.0000625222 | NA | NA | NA | NA |
| | octadecatrienoyl-CoA (C18:3CoA, n-3) | 3 | 0.0000704046 | NA | NA | NA | NA |
| C00007 | O2 | 4 | 0.000162911 | 1223 | 0.692827 | 486 | 0.478058 |
| | octadecatrienoyl-CoA (C18:3CoA, n-6) | 5 | 0.000181388 | NA | NA | NA | NA |
| | stearidonyl coenzyme A (C18:4CoA, n-3) | 5 | 0.000181388 | NA | NA | NA | NA |
| C00010 | Coenzyme A | 6 | 0.000904182 | 184 | 0.0733702 | 360 | 0.35624 |
| | eicosatetraenoyl-CoA (C20:4CoA, n-6) | 7 | 0.00144942 | NA | NA | NA | NA |
| | eicosatrienoyl-CoA (C20:3CoA, n-6) | 7 | 0.00144942 | NA | NA | NA | NA |
| | heptadecenoyl CoA (C17:1CoA, n-8) | 8 | 0.00267006 | NA | NA | NA | NA |
| | eicosenoyl-CoA (C20:1CoA, n-11) | 9 | 0.00291506 | NA | NA | NA | NA |
| | tetradecenoyl-CoA (C14:1CoA, n-5) | 9 | 0.00291506 | NA | NA | NA | NA |
| | docosenoyl-CoA (C22:1CoA, n-9) | 10 | 0.00321812 | NA | NA | NA | NA |
| C05272 | hexadecenoyl-CoA (C16:1CoA, n-9) | 10 | 0.00321812 | 80 | 0.0202188 | 588 | 0.570928 |
| C00510 | octadecenoyl-CoA (C18:1CoA, n-7) | 10 | 0.00321812 | 79 | 0.0193181 | 1 | 0.00000502544 |

Table A.46: The top 10 reporter metabolites of the comparison ’WM - before LCD vs. after LCD’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
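How a row of such a cross-model table comes about can be sketched as a lookup by KEGG ID. The sketch below assumes that the models are matched purely by KEGG ID and that a metabolite without an ID, or one missing from the other model, is reported as NA; the function name `cross_model_ranks` and the data layout are hypothetical, while the values are taken from Table A.46.

```python
def cross_model_ranks(reference_rows, other_models):
    """Attach rank and p-value from other models to each reference row.

    reference_rows: list of (kegg_id or None, name, rank, p_value)
    other_models:   {model name: {kegg_id: (rank, p_value)}}
    Assumption: matching is done by KEGG ID only; anything that cannot
    be matched is reported as ("NA", "NA").
    """
    combined = []
    for kegg_id, name, rank, p_value in reference_rows:
        row = {"kegg_id": kegg_id, "name": name, "rank": rank, "p_value": p_value}
        for model, table in other_models.items():
            # A missing ID (None) is never a key, so it falls back to NA.
            row[model] = table.get(kegg_id, ("NA", "NA"))
        combined.append(row)
    return combined


# Three rows of Table A.46 (adipocyte model as reference).
adipocyte_rows = [
    (None, "octadecadienoyl-CoA (C18:2CoA, n-6)", 1, 0.00000292448),
    ("C00007", "O2", 4, 0.000162911),
    ("C00510", "octadecenoyl-CoA (C18:1CoA, n-7)", 10, 0.00321812),
]
other_models = {
    "EHMN": {"C00007": (1223, 0.692827), "C00510": (79, 0.0193181)},
    "Recon 1": {"C00007": (486, 0.478058), "C00510": (1, 0.00000502544)},
}
combined_rows = cross_model_ranks(adipocyte_rows, other_models)
```

Under this assumption, the many NA entries for the adipocyte acyl-CoA species follow directly from their missing KEGG IDs.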
| KEGG ID | Metabolite name | EHMN rank | EHMN P-value | Adipocyte rank | Adipocyte P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| C02249 | Arachidonyl-CoA | 1 | 0.00000324492 | NA | NA | 227 | 0.23024 |
| C01794 | Choloyl-CoA | 2 | 0.0000594551 | NA | NA | 774 | 0.723132 |
| CE2254 | docosanoyl-CoA | 2 | 0.0000594551 | NA | NA | NA | NA |
| C00412 | Stearoyl-CoA | 2 | 0.0000594551 | 12 | 0.0137419 | 14 | 0.00383382 |
| CE2257 | tetracosanoyl-CoA | 2 | 0.0000594551 | NA | NA | NA | NA |
| C02050 | Linoleoyl-CoA | 3 | 0.0000701231 | NA | NA | NA | NA |
| C00510 | Oleoyl-CoA | 4 | 0.000318929 | 12 | 0.0137419 | 1 | 0.00000502544 |
| C00154 | Palmitoyl-CoA | 4 | 0.000318929 | 14 | 0.0147991 | 52 | 0.0329838 |
| C05965 | 12(S)-HPETE | 5 | 0.000330223 | NA | NA | NA | NA |
| C04717 | 13(S)-HPODE | 5 | 0.000330223 | NA | NA | NA | NA |
| CE2163 | 13-hydroxy-(9Z,11E)-octadecadienoate | 5 | 0.000330223 | NA | NA | NA | NA |
| C04742 | 15(S)-HETE | 5 | 0.000330223 | NA | NA | NA | NA |
| C05966 | 15(S)-HPETE | 5 | 0.000330223 | NA | NA | NA | NA |
| C04805 | 5(S)-HETE | 5 | 0.000330223 | NA | NA | NA | NA |
| C05356 | 5(S)-HPETE | 5 | 0.000330223 | NA | NA | 853 | 0.796538 |
| C14827 | 9(S)-HPODE | 5 | 0.000330223 | NA | NA | NA | NA |
| CE2539 | 9-hydroxyoctadecadienoate | 5 | 0.000330223 | NA | NA | NA | NA |
| CE0852 | palmitoleoyl-CoA | 6 | 0.000335923 | NA | NA | NA | NA |
| C03069 | 3-Methylcrotonyl-CoA | 7 | 0.000899727 | 53 | 0.167654 | 21 | 0.00930988 |
| CE0713 | 3-oxolinoleoyl-CoA | 8 | 0.0011461 | NA | NA | NA | NA |
| C00332 | Acetoacetyl-CoA | 9 | 0.00153217 | 231 | 0.824807 | 338 | 0.335567 |
| C01944 | Octanoyl-CoA | 9 | 0.00153217 | 92 | 0.312845 | 255 | 0.252494 |
| C00091 | Succinyl-CoA | 9 | 0.00153217 | 76 | 0.257895 | 289 | 0.282306 |
| C00016 | FAD | 10 | 0.00159844 | 248 | 0.873929 | 365 | 0.359982 |
| C01352 | FADH2 | 10 | 0.00159844 | 248 | 0.873929 | 365 | 0.359982 |

Table A.47: The top 10 reporter metabolites of the comparison ’WM - before LCD vs. after LCD’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
| KEGG ID | Metabolite name | Recon 1 rank | Recon 1 P-value | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value |
|---|---|---|---|---|---|---|---|
| C00510 | Octadecenoyl-CoA (n-C18:1CoA) | 1 | 0.00000502544 | 12 | 0.0137419 | 79 | 0.0193181 |
| C16218 | trans-Octadec-2-enoyl-CoA | 1 | 0.00000502544 | NA | NA | NA | NA |
| | vaccenyl coenzyme A | 1 | 0.00000502544 | NA | NA | NA | NA |
| | tetracosahexaenoyl coenzyme A | 2 | 0.000033348 | NA | NA | NA | NA |
| C00010 | Coenzyme A | 3 | 0.000246448 | 6 | 0.000904182 | 184 | 0.0733702 |
| | alpha-Linolenoyl-CoA | 4 | 0.000370386 | NA | NA | 167 | 0.065261 |
| C03035 | gamma-linolenoyl-CoA | 4 | 0.000370386 | NA | NA | 38 | 0.00776122 |
| | linoelaidyl coenzyme A | 4 | 0.000370386 | NA | NA | NA | NA |
| | linoleic coenzyme A | 4 | 0.000370386 | NA | NA | NA | NA |
| | stearidonyl coenzyme A | 4 | 0.000370386 | NA | NA | NA | NA |
| | tetracosapentaenoyl coenzyme A, n-3 | 4 | 0.000370386 | NA | NA | NA | NA |
| C05272 | Hexadecenoyl-CoA (n-C16:1CoA) | 5 | 0.00113402 | 10 | 0.00321812 | 80 | 0.0202188 |
| C00003 | Nicotinamide adenine dinucleotide | 6 | 0.00127546 | 33 | 0.0961396 | 191 | 0.0769088 |
| C00422 | triacylglycerol (homo sapiens) | 7 | 0.00127821 | NA | NA | 639 | 0.361776 |
| C00004 | Nicotinamide adenine dinucleotide - reduced | 8 | 0.0014729 | 33 | 0.0961396 | 174 | 0.0684839 |
| C00268 | 6,7-Dihydrobiopterin | 9 | 0.00157729 | NA | NA | 112 | 0.0386679 |
| | 4-Nitrophenyl sulfate | 10 | 0.0019562 | NA | NA | NA | NA |
| | Dopamine 3-O-sulfate | 10 | 0.0019562 | NA | NA | NA | NA |

Note: the table additionally lists the KEGG IDs C16163 and C13690; the rows they belong to cannot be recovered from the printed layout.

Table A.48: The top 10 reporter metabolites of the comparison ’WM - before LCD vs. after LCD’ using the Recon 1 model in comparison to the adipocyte and EHMN model.

Weight regainer - before low calorie diet vs.
after low calorie diet

| KEGG ID | Metabolite name | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| | octadecatrienoyl-CoA (C18:3CoA, n-6) | 1 | 0.000234023 | NA | NA | NA | NA |
| | stearidonyl coenzyme A (C18:4CoA, n-3) | 1 | 0.000234023 | NA | NA | NA | NA |
| | eicosapentaenoyl-CoA (C20:5CoA, n-3) | 2 | 0.00156232 | NA | NA | NA | NA |
| | eicosatetraenoyl-CoA (C20:4CoA, n-3) | 2 | 0.00156232 | NA | NA | NA | NA |
| | octadecatrienoyl-CoA (C18:3CoA, n-3) | 3 | 0.00735661 | NA | NA | NA | NA |
| C00155 | L-Homocysteine | 4 | 0.00774941 | 38 | 0.0182404 | 104 | 0.0866209 |
| | S-(Hydroxymethyl)glutathione | 5 | 0.00786018 | 14 | 0.00596045 | NA | NA |
| C00007 | O2 | 6 | 0.0144094 | 21 | 0.0102723 | 362 | 0.328023 |
| | eicosatetraenoyl-CoA (C20:4CoA, n-6) | 7 | 0.0167658 | NA | NA | NA | NA |
| | eicosatrienoyl-CoA (C20:3CoA, n-6) | 7 | 0.0167658 | NA | NA | NA | NA |
| C00003 | Nicotinamide adenine dinucleotide | 8 | 0.0171684 | 241 | 0.107103 | 2 | 0.0000982315 |
| C00004 | Nicotinamide adenine dinucleotide - reduced | 8 | 0.0171684 | 253 | 0.113878 | 1 | 0.0000892173 |
| C00235 | Dimethylallyl diphosphate | 9 | 0.0194827 | 1728 | 0.962407 | 415 | 0.382652 |
| C00448 | Farnesyl diphosphate | 9 | 0.0194827 | 618 | 0.304821 | 332 | 0.300464 |
| C00341 | Geranyl diphosphate | 9 | 0.0194827 | 1296 | 0.721519 | 33 | 0.0163746 |
| C00129 | Isopentenyl diphosphate | 9 | 0.0194827 | 1728 | 0.962407 | 482 | 0.460095 |
| | octadecadienoyl-CoA (C18:2CoA, n-6) | 10 | 0.0226904 | NA | NA | NA | NA |

Table A.49: The top 10 reporter metabolites of the comparison ’WR - before LCD vs. after LCD’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
| KEGG ID | Metabolite name | EHMN rank | EHMN P-value | Adipocyte rank | Adipocyte P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| C02050 | Linoleoyl-CoA | 1 | 0.000133458 | NA | NA | NA | NA |
| C00510 | Oleoyl-CoA | 2 | 0.000296351 | 66 | 0.269027 | 4 | 0.00061122 |
| C00154 | Palmitoyl-CoA | 2 | 0.000296351 | 64 | 0.253214 | 203 | 0.186697 |
| C00412 | Stearoyl-CoA | 2 | 0.000296351 | 66 | 0.269027 | 111 | 0.0907745 |
| C01094 | D-Fructose 1-phosphate | 3 | 0.000596177 | NA | NA | 6 | 0.000684636 |
| C00577 | D-Glyceraldehyde | 4 | 0.00104135 | NA | NA | 5 | 0.000678521 |
| C06426 | (6Z,9Z,12Z)-Octadecatrienoic acid | 5 | 0.00111288 | NA | NA | 621 | 0.577515 |
| C03035 | gamma-Linolenoyl-CoA | 5 | 0.00111288 | NA | NA | 529 | 0.503971 |
| CE4815 | stearidonoyl-CoA | 5 | 0.00111288 | NA | NA | NA | NA |
| CE4824 | tetracosa-9,12,15,18,21-all-cis-pentaenoyl-CoA | 5 | 0.00111288 | NA | NA | NA | NA |
| CE4837 | tetracosa-9,12,15,18-all-cis-tetraenoyl-CoA | 5 | 0.00111288 | NA | NA | NA | NA |
| C02249 | Arachidonyl-CoA | 6 | 0.00121838 | NA | NA | 101 | 0.0836376 |
| CE4809 | alpha-linolenoyl-CoA | 7 | 0.00203885 | NA | NA | NA | NA |
| CE4823 | tetracosa-6,9,12,15,18,21-all-cis-hexaenoyl-CoA | 7 | 0.00203885 | NA | NA | NA | NA |
| CE4836 | tetracosa-6,9,12,15,18-all-cis-pentaenoyl-CoA | 7 | 0.00203885 | NA | NA | NA | NA |
| C00201 | Nucleoside triphosphate | 8 | 0.0021019 | NA | NA | NA | NA |
| C00222 | 3-Oxopropanoate | 9 | 0.0032366 | NA | NA | 368 | 0.333549 |
| C00447 | Sedoheptulose 1,7-bisphosphate | 9 | 0.0032366 | NA | NA | NA | NA |
| CE2434 | trans,cis,cis-2,9,12-octadecatrienoyl-CoA | 10 | 0.00487151 | NA | NA | NA | NA |
| CE2596 | trans,cis-dodeca-2,5-dienoyl-CoA | 10 | 0.00487151 | NA | NA | NA | NA |
| CE2591 | trans,cis-hexadeca-2,9-dienoyl-CoA | 10 | 0.00487151 | NA | NA | NA | NA |
| CE2594 | trans,cis-myristo-2,7-dienoyl-CoA | 10 | 0.00487151 | NA | NA | NA | NA |
| CE2432 | trans-2-cis,cis-5,8-tetradecatrienoyl-CoA | 10 | 0.00487151 | NA | NA | NA | NA |

Table A.50: The top 10 reporter metabolites of the comparison ’WR - before LCD vs. after LCD’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
| KEGG ID | Metabolite name | Recon 1 rank | Recon 1 P-value | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value |
|---|---|---|---|---|---|---|---|
| C00004 | Nicotinamide adenine dinucleotide - reduced | 1 | 0.0000892173 | 135 | 0.531749 | 253 | 0.113878 |
| C00003 | Nicotinamide adenine dinucleotide | 2 | 0.0000982315 | 135 | 0.531749 | 241 | 0.107103 |
| C05576 | 3,4-Dihydroxyphenylethyleneglycol | 3 | 0.000520894 | NA | NA | 74 | 0.0275416 |
| C00132 | Methanol | 3 | 0.000520894 | NA | NA | 624 | 0.309013 |
| C00510 | Octadecenoyl-CoA (n-C18:1CoA) | 4 | 0.00061122 | 66 | 0.269027 | 498 | 0.247786 |
| C16218 | trans-Octadec-2-enoyl-CoA | 4 | 0.00061122 | NA | NA | NA | NA |
| | vaccenyl coenzyme A | 4 | 0.00061122 | NA | NA | NA | NA |
| C00577 | D-Glyceraldehyde | 5 | 0.000678521 | NA | NA | 4 | 0.00104135 |
| C01094 | D-Fructose 1-phosphate | 6 | 0.000684636 | NA | NA | 3 | 0.000596177 |
| | D-Xylulose 1-phosphate | 6 | 0.000684636 | NA | NA | NA | NA |
| C03451 | (R)-S-Lactoylglutathione | 7 | 0.000914369 | NA | NA | 1715 | 0.956589 |
| C00051 | Reduced glutathione | 8 | 0.00132034 | 136 | 0.533874 | 1506 | 0.844639 |
| C00072 | L-Ascorbate | 9 | 0.00163147 | NA | NA | 25 | 0.0125241 |
| C00424 | L-Lactaldehyde | 10 | 0.0016341 | NA | NA | NA | NA |

Table A.51: The top 10 reporter metabolites of the comparison ’WR - before LCD vs. after LCD’ using the Recon 1 model in comparison to the adipocyte and EHMN model.

A.5.3 Comparison of expression data

The following tables show the comparison of the top 10 reporter metabolites using the different expression data of this dataset and the same model.
Adipocyte

| KEGG ID | Metabolite name | WM rank | WM P-value | WR rank | WR P-value |
|---|---|---|---|---|---|
| | octadecadienoyl-CoA (C18:2CoA, n-6) | 1 | 0.00000292448 | 10 | 0.0226904 |
| | eicosadienoyl-CoA (C20:2CoA, n-6) | 2 | 0.0000625222 | 117 | 0.460496 |
| | octadecatrienoyl-CoA (C18:3CoA, n-3) | 3 | 0.0000704046 | 3 | 0.00735661 |
| C00007 | O2 | 4 | 0.000162911 | 6 | 0.0144094 |
| | octadecatrienoyl-CoA (C18:3CoA, n-6) | 5 | 0.000181388 | 1 | 0.000234023 |
| | stearidonyl coenzyme A (C18:4CoA, n-3) | 5 | 0.000181388 | 1 | 0.000234023 |
| C00010 | Coenzyme A | 6 | 0.000904182 | 42 | 0.178629 |
| | eicosatetraenoyl-CoA (C20:4CoA, n-6) | 7 | 0.00144942 | 7 | 0.0167658 |
| | eicosatrienoyl-CoA (C20:3CoA, n-6) | 7 | 0.00144942 | 7 | 0.0167658 |
| | heptadecenoyl CoA (C17:1CoA, n-8) | 8 | 0.00267006 | 112 | 0.452045 |
| | eicosenoyl-CoA (C20:1CoA, n-11) | 9 | 0.00291506 | 87 | 0.351476 |
| | tetradecenoyl-CoA (C14:1CoA, n-5) | 9 | 0.00291506 | 87 | 0.351476 |
| | docosenoyl-CoA (C22:1CoA, n-9) | 10 | 0.00321812 | 101 | 0.403862 |
| C05272 | hexadecenoyl-CoA (C16:1CoA, n-9) | 10 | 0.00321812 | 101 | 0.403862 |
| C00510 | octadecenoyl-CoA (C18:1CoA, n-7) | 10 | 0.00321812 | 66 | 0.269027 |

Table A.52: The comparison of the top 10 reporter metabolites between ’WM - before LCD vs. after LCD’ and ’WR - before LCD vs. after LCD’ based on the adipocyte model.
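The printed ranks are consistent with a dense ranking of the P-values, in which metabolites with identical P-values share one rank and the next distinct P-value receives the next rank. This explains why a "top 10" table can contain more than ten rows. The following minimal sketch reproduces the WM ranks of the first seven rows of Table A.52. That ties are defined by exactly equal P-values is an assumption inferred from the tables, and the function name `dense_rank` is hypothetical.

```python
def dense_rank(p_values):
    """Return the dense rank (1 = smallest p-value) of every entry.

    Assumption: equal p-values share a rank and no rank numbers are
    skipped after a tie, as the printed tables suggest.
    """
    rank_of = {}
    for p in sorted(set(p_values)):
        rank_of[p] = len(rank_of) + 1
    return [rank_of[p] for p in p_values]


# WM P-values of the first seven rows of Table A.52.
wm_p_values = [
    0.00000292448,
    0.0000625222,
    0.0000704046,
    0.000162911,
    0.000181388,
    0.000181388,
    0.000904182,
]
wm_ranks = dense_rank(wm_p_values)
```

With this scheme, selecting all metabolites with rank 10 or lower returns every tied entry, so the number of rows depends on how many metabolites share a P-value.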
| KEGG ID | Metabolite name | WR rank | WR P-value | WM rank | WM P-value |
|---|---|---|---|---|---|
| | octadecatrienoyl-CoA (C18:3CoA, n-6) | 1 | 0.000234023 | 5 | 0.000181388 |
| | stearidonyl coenzyme A (C18:4CoA, n-3) | 1 | 0.000234023 | 5 | 0.000181388 |
| | eicosapentaenoyl-CoA (C20:5CoA, n-3) | 2 | 0.00156232 | 16 | 0.0164849 |
| | eicosatetraenoyl-CoA (C20:4CoA, n-3) | 2 | 0.00156232 | 16 | 0.0164849 |
| | octadecatrienoyl-CoA (C18:3CoA, n-3) | 3 | 0.00735661 | 3 | 0.0000704046 |
| C00155 | L-Homocysteine | 4 | 0.00774941 | 46 | 0.15299 |
| | S-(Hydroxymethyl)glutathione | 5 | 0.00786018 | 90 | 0.310086 |
| C00007 | O2 | 6 | 0.0144094 | 4 | 0.000162911 |
| | eicosatetraenoyl-CoA (C20:4CoA, n-6) | 7 | 0.0167658 | 7 | 0.00144942 |
| | eicosatrienoyl-CoA (C20:3CoA, n-6) | 7 | 0.0167658 | 7 | 0.00144942 |
| C00003 | Nicotinamide adenine dinucleotide | 8 | 0.0171684 | 33 | 0.0961396 |
| C00004 | Nicotinamide adenine dinucleotide - reduced | 8 | 0.0171684 | 33 | 0.0961396 |
| C00235 | Dimethylallyl diphosphate | 9 | 0.0194827 | 239 | 0.841729 |
| C00448 | Farnesyl diphosphate | 9 | 0.0194827 | 104 | 0.378231 |
| C00341 | Geranyl diphosphate | 9 | 0.0194827 | 229 | 0.815733 |
| C00129 | Isopentenyl diphosphate | 9 | 0.0194827 | 239 | 0.841729 |
| | octadecadienoyl-CoA (C18:2CoA, n-6) | 10 | 0.0226904 | 1 | 0.00000292448 |

Table A.53: The comparison of the top 10 reporter metabolites between ’WR - before LCD vs. after LCD’ and ’WM - before LCD vs. after LCD’ based on the adipocyte model.
EHMN

| KEGG ID | Metabolite name | WM rank | WM P-value | WR rank | WR P-value |
|---|---|---|---|---|---|
| C02249 | Arachidonyl-CoA | 1 | 0.00000324492 | 27 | 0.0140488 |
| C01794 | Choloyl-CoA | 2 | 0.0000594551 | 386 | 0.186458 |
| CE2254 | docosanoyl-CoA | 2 | 0.0000594551 | 605 | 0.298826 |
| C00412 | Stearoyl-CoA | 2 | 0.0000594551 | 10 | 0.00304985 |
| CE2257 | tetracosanoyl-CoA | 2 | 0.0000594551 | 739 | 0.371139 |
| C02050 | Linoleoyl-CoA | 3 | 0.0000701231 | 1 | 0.000133458 |
| C00510 | Oleoyl-CoA | 4 | 0.000318929 | 498 | 0.247786 |
| C00154 | Palmitoyl-CoA | 4 | 0.000318929 | 2 | 0.000296351 |
| C05965 | 12(S)-HPETE | 5 | 0.000330223 | 1644 | 0.921654 |
| C04717 | 13(S)-HPODE | 5 | 0.000330223 | 315 | 0.14657 |
| CE2163 | 13-hydroxy-(9Z,11E)-octadecadienoate | 5 | 0.000330223 | 315 | 0.14657 |
| C04742 | 15(S)-HETE | 5 | 0.000330223 | 1644 | 0.921654 |
| C05966 | 15(S)-HPETE | 5 | 0.000330223 | 1674 | 0.937448 |
| C04805 | 5(S)-HETE | 5 | 0.000330223 | 1773 | 0.987977 |
| C05356 | 5(S)-HPETE | 5 | 0.000330223 | 593 | 0.288994 |
| C14827 | 9(S)-HPODE | 5 | 0.000330223 | 1349 | 0.748942 |
| CE2539 | 9-hydroxyoctadecadienoate | 5 | 0.000330223 | 1251 | 0.699966 |
| CE0852 | palmitoleoyl-CoA | 6 | 0.000335923 | 99 | 0.0385497 |
| C03069 | 3-Methylcrotonyl-CoA | 7 | 0.000899727 | 750 | 0.377265 |
| CE0713 | 3-oxolinoleoyl-CoA | 8 | 0.0011461 | 36 | 0.0178061 |
| C00332 | Acetoacetyl-CoA | 9 | 0.00153217 | 336 | 0.15702 |
| C01944 | Octanoyl-CoA | 9 | 0.00153217 | 1473 | 0.830036 |
| C00091 | Succinyl-CoA | 9 | 0.00153217 | 36 | 0.0178061 |
| C00016 | FAD | 10 | 0.00159844 | 307 | 0.144276 |
| C01352 | FADH2 | 10 | 0.00159844 | 866 | 0.448917 |

Table A.54: The comparison of the top 10 reporter metabolites between ’WM - before LCD vs. after LCD’ and ’WR - before LCD vs. after LCD’ based on the EHMN model.
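One way to read such a two-column comparison is to ask which metabolites stay within the top 10 under both expression comparisons. The sketch below filters rows by a rank cutoff in both columns; the cutoff of 10, the function name `shared_top` and the tuple layout are illustrative assumptions, and the ranks are a subset of Table A.54.

```python
def shared_top(rows, cutoff=10):
    """Return names whose rank is within the cutoff in both comparisons.

    rows: list of (name, rank in comparison A, rank in comparison B)
    Assumption: "shared" simply means rank <= cutoff in both columns.
    """
    return [name for name, rank_a, rank_b in rows
            if rank_a <= cutoff and rank_b <= cutoff]


# (name, WM rank, WR rank) for six rows of Table A.54 (EHMN model).
ehmn_rows = [
    ("Arachidonyl-CoA", 1, 27),
    ("Stearoyl-CoA", 2, 10),
    ("Linoleoyl-CoA", 3, 1),
    ("Oleoyl-CoA", 4, 498),
    ("Palmitoyl-CoA", 4, 2),
    ("FAD", 10, 307),
]
shared = shared_top(ehmn_rows)
```

For these rows the filter keeps Stearoyl-CoA, Linoleoyl-CoA and Palmitoyl-CoA, which are the entries ranked 10 or better in both the WM and the WR comparison.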
| KEGG ID | Metabolite name | WR rank | WR P-value | WM rank | WM P-value |
|---|---|---|---|---|---|
| C02050 | Linoleoyl-CoA | 1 | 0.000133458 | 3 | 0.0000701231 |
| C00510 | Oleoyl-CoA | 2 | 0.000296351 | 79 | 0.0193181 |
| C00154 | Palmitoyl-CoA | 2 | 0.000296351 | 4 | 0.000318929 |
| C00412 | Stearoyl-CoA | 2 | 0.000296351 | 7 | 0.000456018 |
| C01094 | D-Fructose 1-phosphate | 3 | 0.000596177 | 321 | 0.159899 |
| C00577 | D-Glyceraldehyde | 4 | 0.00104135 | 1245 | 0.705411 |
| C06426 | (6Z,9Z,12Z)-Octadecatrienoic acid | 5 | 0.00111288 | 38 | 0.00776122 |
| C03035 | gamma-Linolenoyl-CoA | 5 | 0.00111288 | 38 | 0.00776122 |
| CE4815 | stearidonoyl-CoA | 5 | 0.00111288 | 207 | 0.0863126 |
| CE4824 | tetracosa-9,12,15,18,21-all-cis-pentaenoyl-CoA | 5 | 0.00111288 | 207 | 0.0863126 |
| CE4837 | tetracosa-9,12,15,18-all-cis-tetraenoyl-CoA | 5 | 0.00111288 | 207 | 0.0863126 |
| C02249 | Arachidonyl-CoA | 6 | 0.00121838 | 94 | 0.0290386 |
| CE4809 | alpha-linolenoyl-CoA | 7 | 0.00203885 | 167 | 0.065261 |
| CE4823 | tetracosa-6,9,12,15,18,21-all-cis-hexaenoyl-CoA | 7 | 0.00203885 | 167 | 0.065261 |
| CE4836 | tetracosa-6,9,12,15,18-all-cis-pentaenoyl-CoA | 7 | 0.00203885 | 167 | 0.065261 |
| C00201 | Nucleoside triphosphate | 8 | 0.0021019 | 1407 | 0.779159 |
| C00222 | 3-Oxopropanoate | 9 | 0.0032366 | 401 | 0.205611 |
| C00447 | Sedoheptulose 1,7-bisphosphate | 9 | 0.0032366 | 401 | 0.205611 |
| CE2434 | trans,cis,cis-2,9,12-octadecatrienoyl-CoA | 10 | 0.00487151 | 26 | 0.0049764 |
| CE2596 | trans,cis-dodeca-2,5-dienoyl-CoA | 10 | 0.00487151 | 26 | 0.0049764 |
| CE2591 | trans,cis-hexadeca-2,9-dienoyl-CoA | 10 | 0.00487151 | 26 | 0.0049764 |
| CE2594 | trans,cis-myristo-2,7-dienoyl-CoA | 10 | 0.00487151 | 26 | 0.0049764 |
| CE2432 | trans-2-cis,cis-5,8-tetradecatrienoyl-CoA | 10 | 0.00487151 | 107 | 0.034329 |

Table A.55: The comparison of the top 10 reporter metabolites between ’WR - before LCD vs. after LCD’ and ’WM - before LCD vs. after LCD’ based on the EHMN model.
Recon 1

| KEGG ID | Metabolite name | WM rank | WM P-value | WR rank | WR P-value |
|---|---|---|---|---|---|
| C00510 | Octadecenoyl-CoA (n-C18:1CoA) | 1 | 0.00000502544 | 4 | 0.00061122 |
| C16218 | trans-Octadec-2-enoyl-CoA | 1 | 0.00000502544 | 529 | 0.503971 |
| | vaccenyl coenzyme A | 1 | 0.00000502544 | 529 | 0.503971 |
| | tetracosahexaenoyl coenzyme A | 2 | 0.000033348 | 19 | 0.0089341 |
| C00010 | Coenzyme A | 3 | 0.000246448 | 237 | 0.213742 |
| | alpha-Linolenoyl-CoA | 4 | 0.000370386 | 529 | 0.503971 |
| C03035 | gamma-linolenoyl-CoA | 4 | 0.000370386 | 529 | 0.503971 |
| | linoelaidyl coenzyme A | 4 | 0.000370386 | 529 | 0.503971 |
| | linoleic coenzyme A | 4 | 0.000370386 | 529 | 0.503971 |
| | stearidonyl coenzyme A | 4 | 0.000370386 | 529 | 0.503971 |
| | tetracosapentaenoyl coenzyme A, n-3 | 4 | 0.000370386 | 195 | 0.173053 |
| C05272 | Hexadecenoyl-CoA (n-C16:1CoA) | 5 | 0.00113402 | 647 | 0.601787 |
| C00003 | Nicotinamide adenine dinucleotide | 6 | 0.00127546 | 2 | 0.0000982315 |
| C00422 | triacylglycerol (homo sapiens) | 7 | 0.00127821 | 343 | 0.310791 |
| C00004 | Nicotinamide adenine dinucleotide - reduced | 8 | 0.0014729 | 1 | 0.0000892173 |
| C00268 | 6,7-Dihydrobiopterin | 9 | 0.00157729 | 871 | 0.830442 |
| | 4-Nitrophenyl sulfate | 10 | 0.0019562 | 23 | 0.0109026 |
| | Dopamine 3-O-sulfate | 10 | 0.0019562 | 23 | 0.0109026 |

Note: the table additionally lists the KEGG IDs C16163 and C13690; the rows they belong to cannot be recovered from the printed layout.

Table A.56: The comparison of the top 10 reporter metabolites between ’WM - before LCD vs. after LCD’ and ’WR - before LCD vs. after LCD’ based on the Recon 1 model.
| KEGG ID | Metabolite name | WR rank | WR P-value | WM rank | WM P-value |
|---|---|---|---|---|---|
| C00004 | Nicotinamide adenine dinucleotide - reduced | 1 | 0.0000892173 | 8 | 0.0014729 |
| C00003 | Nicotinamide adenine dinucleotide | 2 | 0.0000982315 | 6 | 0.00127546 |
| C05576 | 3,4-Dihydroxyphenylethyleneglycol | 3 | 0.000520894 | 130 | 0.110313 |
| C00132 | Methanol | 3 | 0.000520894 | 130 | 0.110313 |
| C00510 | Octadecenoyl-CoA (n-C18:1CoA) | 4 | 0.00061122 | 1 | 0.00000502544 |
| C16218 | trans-Octadec-2-enoyl-CoA | 4 | 0.00061122 | 415 | 0.40608 |
| | vaccenyl coenzyme A | 4 | 0.00061122 | 415 | 0.40608 |
| C00577 | D-Glyceraldehyde | 5 | 0.000678521 | 647 | 0.619548 |
| C01094 | D-Fructose 1-phosphate | 6 | 0.000684636 | 172 | 0.159806 |
| | D-Xylulose 1-phosphate | 6 | 0.000684636 | 172 | 0.159806 |
| C03451 | (R)-S-Lactoylglutathione | 7 | 0.000914369 | 607 | 0.5852 |
| C00051 | Reduced glutathione | 8 | 0.00132034 | 68 | 0.048087 |
| C00072 | L-Ascorbate | 9 | 0.00163147 | 664 | 0.62973 |
| C00424 | L-Lactaldehyde | 10 | 0.0016341 | 127 | 0.107009 |

Table A.57: The comparison of the top 10 reporter metabolites between ‘WR - before LCD vs. after LCD’ and ‘WM - before LCD vs. after LCD’ based on the Recon 1 model.

A.6 Characterization of the initial molecular events of adipose tissue development and growth during overfeeding in humans (GSE28005)

Healthy lean and overweight subjects were placed on a high-fat diet for 56 days [13]:
- 18 subjects
- 3 paired samples per subject, taken at day 0, day 14, and day 56

The following comparisons were used to calculate the differentially expressed genes: (i) day 0 vs. day 14 and (ii) day 0 vs. day 56.

A.6.1 Differentially expressed genes

The following tables show the top 10 differentially expressed genes, the pathways they are involved in, and the top 10 reporter metabolites that are also involved in these pathways.

Day 0 vs.
day 14 Rank EntrezID GeneName P-value Pathway Adipocyte EHMN Recon1 RefSeqID 1 4739 NM 001142393 2 neural precursor NM 006403 developmentally NM 182966 down-regulated 9 6954 NM 001093728 2.34864686333649e-06 cell expressed, t-complex 11 2.13866148594628e-05 homolog (mouse) NM 018679 3 1490 NM 001901 4 286002 XM 001715026 connective tissue 2.4394198929303e-05 growth factor hypothetical prot- 2.66722001916861e-05 ein LOC286002 XM 001718146 XM 001718326 5 253635 NM 174931 6 30814 NM 014589 coiled-coil domain 3.88394993829332e-05 containing 75 phospholipase A2, 9.85819950741516e-05 group IIE hsa00564: Glycerophospho- C00093 lipid metabolism hsa00565: Ether lipid metabolism hsa00590: Arachidonic acid metabolism hsa00591: Linoleic acid metabolism hsa00592: alpha-Linolenic acid metabolism hsa01100: Metabolic C00010 C00008 C00001 pathways C00026 C00018 C00083 C00041 C00024 C00097 C00083 C00026 C00164 C00097 C00093 C00249 C00164 C00314 C00422 C00249 C00534 C01944 C03373 C00627 C03373 C05272 C00647 C03373 98 APPENDIX A. 
RESULTS OF ALL SELECTED DATASETS Rank EntrezID GeneName P-value Pathway Adipocyte EHMN Recon1 RefSeqID hsa04010: MAPK signaling pathway hsa04270: Vascular smooth muscle contraction hsa04370: VEGF signaling pathway hsa04664: Fc epsilon RI signaling pathway hsa04724: Glutamatergic synapse hsa04726: Serotonergic synapse hsa04730: Long-term depression hsa04912: GnRH signaling pathway hsa04972: Pancreatic C00001 secretion hsa04975: Fat digestion C00010 C00422 and absorption hsa05145: Toxoplasmosis 7 79695 NM 024642 UDP-N-acetyl- 0.000115028315666452 hsa00512: Mucin type alpha-D- O-Glycan biosynthesis galactosamine: hsa01100: Metabolic C00010 C00008 C00001 polypeptide pathways C00026 C00018 C00083 N-acetyl- C00041 C00024 C00097 galactosaminyl- C00083 C00026 C00164 transferase 12 C00097 C00093 C00249 (GalNAc-T12) C00164 C00314 C00422 C00249 C00534 C01944 C03373 C00627 C03373 C05272 C00647 C03373 8 401068 hypothetical gene XR 041577 supported by XR 041578 BC028186 0.000122753694924759 XR 041578 9 84649 NM 032564 diacylglycerol 0.000145614029643437 hsa00561: Glycerolipid O-acyltransferase metabolism homolog 2 (mouse) hsa01100: Metabolic pathways C00093 C00422 C00010 C00008 C00001 C00026 C00018 C00083 C00041 C00024 C00097 C00083 C00026 C00164 C00097 C00093 C00249 C00164 C00314 C00422 C00249 C00534 C01944 C03373 C00627 C03373 C05272 C00647 C03373 hsa04975: Fat digestion C00010 C00422 and absorption 10 166979 NM 152623 cell division cycle 0.000172732888620482 20 homolog B (S. cerevisiae) Table A.58: The top 10 differentially expressed genes from the comparison ’day 0 vs. day 14’ with the corresponding pathways and reporter metabolites. 99 APPENDIX A. RESULTS OF ALL SELECTED DATASETS Day 0 vs. 
day 56 Rank EntrezID GeneName P-value Pathway Adipocyte EHMN Recon1 RefSeqID 1 9843 hephaestin 1.40947289382373e-06 NM 001130860 hsa04978: Mineral absorption NM 014799 NM 138737 2 2115 ets variant 1 2.50155485453957e-06 1535 cytochrome b-245, 4.28785657794403e-06 NM 000101 alpha polypeptide NM 004956 3 hsa05202: Transcriptional misregulation in cancer hsa04145: Phagosome C00007 C00027 C00704 hsa04380: Osteoclast C00027 differentiation hsa04670: Leukocyte trans- C00027 endothelial migration hsa05140: Leishmaniasis 4 441024 NM 001004346 methylenetetra- 4.44745625822843e-06 hsa00670: One carbon hydrofolate pool by folate dehydrogenase hsa01100: Metabolic C00005 C00002 C00154 (NADP+ depen-) pathways C00006 C00008 C00164 C00007 C00060 C05272 C00092 C00164 C00122 C00257 C00164 C00426 dent 2-like C00364 C01061 C00365 G00159 G00160 5 1278 NM 000089 collagen, type I, 4.80281529509077e-06 alpha 2 hsa04510: Focal adhesion hsa04512: ECM-receptor interaction hsa04974: Protein digestion and absorption hsa05146: Amoebiasis 6 7 6423 secreted frizzled- NM 003013 related protein 2 10644 NM 001007225 8 insulin-like growth C00027 hsa04310: Wnt signaling pathway 1.15857882753486e-05 factor 2 mRNA NM 006548 binding protein 2 27115 phosphodiesterase NM 018945 5.99850512222573e-06 1.24992059917824e-05 7B hsa00230: Purine C00002 metabolism C00008 hsa05032: Morphine addiction 9 7140 NM 001042780 troponin T type 3 1.43295940851921e-05 (skeletal, fast) NM 001042781 NM 001042782 NM 006757 10 54829 asporin 1.45832184738187e-05 NM 017680 Table A.59: The top 10 differentially expressed genes from the comparison ’day 0 vs. day 56’ with the corresponding pathways and reporter metabolites. A.6.2 Comparison between the models The following tables show the top 10 reporter metabolites of one model in comparison to the rank of these metabolites using the other two models and the same expression data. 100 APPENDIX A. RESULTS OF ALL SELECTED DATASETS Day 0 vs. 
day 14

| KEGG ID | Metabolite name | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| C00083 | Malonyl-CoA | 1 | 0.0031377 | 12 | 0.0101903 | 10 | 0.00818013 |
| C00041 | L-Alanine | 2 | 0.00446266 | 1183 | 0.583894 | 154 | 0.121584 |
| C00097 | L-Cysteine | 2 | 0.00446266 | 26 | 0.0165483 | 4 | 0.00216723 |
| C00164 | Acetoacetate | 3 | 0.00471785 | 67 | 0.0335582 | 823 | 0.648248 |
| | eicosadienoyl-CoA (C20:2CoA, n-6) | 4 | 0.00673465 | NA | NA | NA | NA |
| C02679 | dodecanoate (C12:0) | 5 | 0.00717003 | 35 | 0.0208393 | 136 | 0.106821 |
| | Eicosanoate (n-C20:0) | 5 | 0.00717003 | NA | NA | NA | NA |
| C00249 | hexadecanoate (n-C16:0) | 5 | 0.00717003 | 1227 | 0.605387 | 591 | 0.447948 |
| C01530 | octadecanoate (n-C18:0) | 5 | 0.00717003 | 382 | 0.179519 | 259 | 0.194384 |
| | pentadecanoate (C15:0) | 5 | 0.00717003 | NA | NA | NA | NA |
| C06424 | tetradecanoate (C14:0) | 5 | 0.00717003 | 35 | 0.0208393 | 59 | 0.0363937 |
| | docosenoyl-CoA (C22:1CoA, n-9) | 6 | 0.00736875 | NA | NA | NA | NA |
| C05272 | hexadecenoyl-CoA (C16:1CoA, n-9) | 6 | 0.00736875 | 470 | 0.231362 | 1195 | 0.96341 |
| C00510 | octadecenoyl-CoA (C18:1CoA, n-7) | 6 | 0.00736875 | 667 | 0.332364 | 213 | 0.163803 |
| C03373 | 5-amino-1-(5-phospho-D-ribosyl)imidazole | 7 | 0.00862875 | 4 | 0.00475059 | 9 | 0.00590893 |
| C00026 | 2-Oxoglutarate | 8 | 0.0120078 | 1317 | 0.649452 | 522 | 0.395589 |
| | sn-Glycerol 3-phosphate | 9 | 0.0135386 | 880 | 0.430031 | NA | NA |
| C00010 | Coenzyme A | 10 | 0.015976 | 155 | 0.0728419 | 1003 | 0.796107 |

Table A.60: The top 10 reporter metabolites of the comparison ‘day 0 vs. day 14’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
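As a rough illustration of how a cross-model table such as Table A.60 could be assembled, the following Python sketch looks up each top-ranked metabolite of a reference model in the result lists of the other two models by KEGG ID and reports NA when no ID is available or no match exists. The function name `cross_model_rows` and the data layout are hypothetical and not taken from the thesis; the numbers are three rows copied from Table A.60 for illustration only.

```python
# Top entries of the reference model: (KEGG ID or None, name, rank, P-value).
# Values copied from Table A.60 for illustration only.
reference = [
    ("C00164", "Acetoacetate", 3, 0.00471785),
    (None, "Eicosanoate (n-C20:0)", 5, 0.00717003),
    ("C00249", "hexadecanoate (n-C16:0)", 5, 0.00717003),
]

# Hypothetical lookup tables for the other models: KEGG ID -> (rank, P-value).
other_models = {
    "EHMN": {"C00164": (67, 0.0335582), "C00249": (1227, 0.605387)},
    "Recon 1": {"C00164": (823, 0.648248), "C00249": (591, 0.447948)},
}


def cross_model_rows(reference, other_models):
    """Build one table row per reference metabolite.

    A metabolite without a KEGG ID, or one that is absent from another
    model, gets "NA" for both the rank and the P-value of that model.
    """
    rows = []
    for kegg_id, name, rank, p_value in reference:
        row = [kegg_id or "", name, rank, p_value]
        for results in other_models.values():
            hit = results.get(kegg_id) if kegg_id else None
            row.extend(hit if hit else ("NA", "NA"))
        rows.append(row)
    return rows


rows = cross_model_rows(reference, other_models)
for row in rows:
    print(row)
```

Matching on KEGG IDs alone means that every metabolite lacking an ID falls through to NA, which is one reason why incomplete ID annotation affects the comparison between models.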
| KEGG ID | Metabolite name | EHMN rank | EHMN P-value | Adipocyte rank | Adipocyte P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| C00024 | Acetyl-CoA | 1 | 0.00143499 | 227 | 0.669832 | 281 | 0.211005 |
| CE5166 | 25(S)-trihydroxycoprostanoyl-CoA | 2 | 0.00148815 | NA | NA | NA | NA |
| C00093 | sn-Glycerol 3-phosphate | 3 | 0.00199595 | NA | NA | 44 | 0.0261254 |
| C03373 | Aminoimidazole ribotide | 4 | 0.00475059 | 7 | 0.00862875 | 9 | 0.00590893 |
| C03125 | L-Cysteinyl-tRNA(Cys) | 5 | 0.00489944 | NA | NA | NA | NA |
| C01639 | tRNA(Cys) | 5 | 0.00489944 | NA | NA | NA | NA |
| C00026 | 2-Oxoglutarate | 6 | 0.00542615 | 217 | 0.636648 | 522 | 0.395589 |
| C02249 | Arachidonyl-CoA | 7 | 0.00641238 | NA | NA | 826 | 0.651886 |
| C00018 | Pyridoxal phosphate | 8 | 0.00698159 | NA | NA | NA | NA |
| C00534 | Pyridoxamine | 8 | 0.00698159 | NA | NA | 212 | 0.16237 |
| C00647 | Pyridoxamine phosphate | 8 | 0.00698159 | NA | NA | NA | NA |
| C00314 | Pyridoxine | 8 | 0.00698159 | NA | NA | 212 | 0.16237 |
| C00627 | Pyridoxine phosphate | 8 | 0.00698159 | NA | NA | NA | NA |
| C03721 | Protein tyrosine-O-sulfate | 9 | 0.00761396 | NA | NA | NA | NA |
| C00008 | ADP | 10 | 0.00790762 | 211 | 0.600735 | 40 | 0.0244097 |

Table A.61: The top 10 reporter metabolites of the comparison ‘day 0 vs. day 14’ using the EHMN model in comparison to the adipocyte and Recon 1 model.

| KEGG ID | Metabolite name | Recon 1 rank | Recon 1 P-value | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value |
|---|---|---|---|---|---|---|---|
| C00097 | L-Cysteine | 1 | 0.000527156 | 45 | 0.1134 | 26 | 0.0165483 |
| C00422 | triacylglycerol (homo sapiens) | 2 | 0.000885051 | NA | NA | 697 | 0.349489 |
| C00164 | Acetoacetate | 3 | 0.00135937 | 240 | 0.753101 | 67 | 0.0335582 |
| C00001 | H2O | 4 | 0.00302208 | 114 | 0.323002 | 381 | 0.178984 |
| C19586 | 8,9 epxoy aflatoxin B1 | 5 | 0.00329827 | NA | NA | NA | NA |
| C06800 | aflatoxin B1 | 5 | 0.00329827 | NA | NA | NA | NA |
| C14497 | 6 beta hydroxy testosterone | 6 | 0.00346225 | NA | NA | 1739 | 0.841028 |
| C00249 | Hexadecanoate (n-C16:0) | 7 | 0.0046149 | 5 | 0.00717003 | 1227 | 0.605387 |
| C03373 | 5-amino-1-(5-phospho-D-ribosyl)imidazole | 8 | 0.00590893 | 7 | 0.00862875 | 4 | 0.00475059 |
| C00083 | Malonyl-CoA | 9 | 0.00818013 | 1 | 0.0031377 | 12 | 0.0101903 |
| C01944 | Octanoyl-CoA (n-C8:0CoA) | 10 | 0.00924781 | 115 | 0.324839 | 413 | 0.19678 |

Table A.62: The top 10 reporter metabolites of the comparison ‘day 0 vs.
day 14’ using the Recon 1 model in comparison to the adipocyte and EHMN model.

Day 0 vs. day 56

| KEGG ID | Metabolite name | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| C00164 | Acetoacetate | 1 | 0.00133727 | 89 | 0.0521501 | 472 | 0.351164 |
| C01342 | Ammonium | 2 | 0.00149079 | 1899 | 0.892111 | 504 | 0.376983 |
| C00122 | Fumarate | 3 | 0.0141678 | 209 | 0.110517 | 138 | 0.10333 |
| C00092 | D-Glucose 6-phosphate | 4 | 0.014582 | 190 | 0.101862 | 682 | 0.527185 |
| C00365 | dUMP | 5 | 0.0159453 | 143 | 0.0767043 | 198 | 0.136195 |
| | heptadecenoyl CoA (C17:1CoA, n-8) | 6 | 0.0259099 | NA | NA | NA | NA |
| C00006 | Nicotinamide adenine dinucleotide phosphate | 7 | 0.0290297 | 1263 | 0.599895 | 1079 | 0.867003 |
| C00005 | Nicotinamide adenine dinucleotide phosphate - reduced | 7 | 0.0290297 | 498 | 0.256139 | 1100 | 0.882603 |
| C00364 | dTMP | 8 | 0.0359442 | 1732 | 0.820842 | 36 | 0.0295632 |
| | eicosadienoyl-CoA (C20:2CoA, n-6) | 9 | 0.0373189 | NA | NA | NA | NA |
| C00027 | Hydrogen peroxide | 10 | 0.0392387 | 1989 | 0.937971 | 406 | 0.292518 |
| C00007 | O2 | 10 | 0.0392387 | 192 | 0.102931 | 638 | 0.494795 |
| C00704 | Superoxide | 10 | 0.0392387 | 421 | 0.219928 | 566 | 0.425701 |

Table A.63: The top 10 reporter metabolites of the comparison ‘day 0 vs. day 56’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
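In the top 10 columns of these tables, metabolites with identical P-values share a rank and the next distinct P-value receives the next integer (for example, three metabolites at rank 10 in Table A.63). This corresponds to what is usually called dense ranking. The sketch below shows one way to compute such ranks in Python; the function name `dense_rank` is hypothetical, and whether the thesis computed its ranks exactly this way is an assumption. The P-values are the last five rows of Table A.63, so the ranks printed here start at 1 rather than 8.

```python
def dense_rank(p_values):
    """Map each metabolite name to its dense rank.

    Tied P-values share a rank and no rank numbers are skipped.
    """
    distinct = sorted(set(p_values.values()))
    rank_of = {p: position + 1 for position, p in enumerate(distinct)}
    return {name: rank_of[p] for name, p in p_values.items()}


# Illustrative P-values copied from Table A.63 (adipocyte model).
p_values = {
    "dTMP": 0.0359442,
    "eicosadienoyl-CoA (C20:2CoA, n-6)": 0.0373189,
    "Hydrogen peroxide": 0.0392387,
    "O2": 0.0392387,
    "Superoxide": 0.0392387,
}

ranks = dense_rank(p_values)
for name, rank in ranks.items():
    print(rank, name)
```

With dense ranking a "top 10" list can contain more than ten metabolites, which is why several of the tables in this appendix have more than ten rows.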
EHMN model KEGG ID Metabolite name Adipocyte model Rank P-value Rank P-value Recon 1 model Rank P-value G00159 (Gal)2 (GalNAc)1 (GlcA)2 (Xyl)1 (Ser)1 1 0.00119035 NA NA NA NA CN0004 IdoAbeta1-3GalNAcbeta1-4IdoAbeta1- 1 0.00119035 NA NA NA NA 3GalNAcbeta1-4GlcAbeta1-3Galbeta13Galbeta1-4Xylbeta1-Ser-peptide C01777 Acylcholine 2 0.00133707 NA NA NA NA C00060 Carboxylate 2 0.00133707 NA NA NA NA G00160 (Gal)2 (GalNAc)2 (GlcA)2 (Xyl)1 (Ser)1 3 0.00240525 NA NA 826 0.635433 CN0005 Chondroitin sulfate C 3 0.00240525 NA NA NA NA CN0006 Chondroitin sulfate D 3 0.00240525 NA NA NA NA CN0007 Chondroitin sulfate E 3 0.00240525 NA NA NA NA C00426 Dermatan sulfate 3 0.00240525 NA NA NA NA CN0002 GalNAcbeta1-4IdoAbeta1-3GalNAcbeta1- 3 0.00240525 NA NA NA NA 3 0.00240525 NA NA NA NA 3 0.00240525 NA NA NA NA 4GlcAbeta1-3Galbeta1-3Galbeta1-4Xylbeta1Ser-peptide CN0003 GlcAbeta1-3GalNAcbeta1-4IdoAbeta13GalNAcbeta1-4GlcAbeta1-3Galbeta13Galbeta1-4Xylbeta1-Ser-peptide CN0001 IdoAbeta1-3GalNAcbeta1-4GlcAbeta13Galbeta1-3Galbeta1-4Xylbeta1-Ser-peptide C00164 Acetoacetate 4 0.00298485 179 0.562478 472 0.351164 C00008 ADP 5 0.00414578 25 0.0721646 725 0.559198 C00002 ATP 5 0.00414578 88 0.220569 493 0.364706 CE5869 lysyl-proline 6 0.00477995 NA NA NA NA CE5868 N-acetyl-seryl-aspartate 6 0.00477995 NA NA NA NA CE5867 N-acetyl-seryl-aspartyl-lysyl-proline 6 0.00477995 NA NA NA NA C01061 4-Fumarylacetoacetate 7 0.00491131 52 0.123833 152 0.110522 CE0852 palmitoleoyl-CoA 8 0.0073061 NA NA NA NA CE5787 kinetensin 1-3 9 0.00811106 NA NA NA NA C00257 D-Gluconic acid 10 0.00922411 NA NA NA NA Table A.64: The top 10 reporter metabolites of the comparison ’day 0 vs. day 56’ using the EHMN model in comparison to the adipocyte and Recon 1 model. 102 APPENDIX A. 
RESULTS OF ALL SELECTED DATASETS Recon1 model KEGG ID C00164 Metabolite name Adipocyte model Rank P-value Rank P-value Acetoacetate 1 0.00034461 179 0.562478 EHMN model Rank P-value 89 0.0521501 NA NA NA 0.159726 126 0.0689563 R total 3 Coenzyme A 2 0.000626834 NA C00510 Octadecenoyl-CoA (n-C18:1CoA) 3 0.000677739 71 C16218 trans-Octadec-2-enoyl-CoA 3 0.000677739 NA NA NA NA vaccenyl coenzyme A 3 0.000677739 NA NA NA NA C05272 Hexadecenoyl-CoA (n-C16:1CoA) 4 0.00142397 16 0.0515015 C00412 Stearoyl-CoA (n-C18:0CoA) 5 0.00144629 71 0.159726 alpha-Linolenoyl-CoA 6 0.0020225 NA NA 208 0.11006 linoelaidyl coenzyme A 6 0.0020225 NA NA NA NA linoleic coenzyme A 6 0.0020225 NA NA NA NA tetracosapentaenoyl coenzyme A, n-3 6 0.0020225 NA NA NA NA C01342 Ammonium 7 0.0032767 185 0.576777 1899 C00154 Palmitoyl-CoA (n-C16:0CoA) 8 0.00335413 102 0.265511 151 0.0798932 arachidyl coenzyme A 9 0.00409846 NA NA NA NA cervonyl coenzyme A 9 0.00409846 NA NA NA NA docosa-4,7,10,13,16-pentaenoyl coenzyme A 9 0.00409846 NA NA NA NA heptadecanoyl coa 9 0.00409846 NA NA NA NA Hexacosanoyl-CoA (n-C26:0CoA) 9 0.00409846 NA NA NA NA lignocericyl coenzyme A 9 0.00409846 NA NA NA NA nervonyl coenzyme A 9 0.00409846 NA NA NA NA pentadecanoyl Coenzyme A 9 0.00409846 NA NA NA NA tetracosapentaenoyl coenzyme A, n-6 9 0.00409846 NA NA NA NA C16171 tetracosatetraenoyl coenzyme A 9 0.00409846 NA NA NA NA C01211 Procollagen 5-hydroxy-L-lysine 10 0.00797606 NA NA 1897 C16740 Procollagen L-lysine 10 0.00797606 NA NA NA C16173 1194 24 0.570979 0.0162869 0.892111 0.890457 NA Table A.65: The top 10 reporter metabolites of the comparison ’day 0 vs. day 56’ using the Recon 1 model in comparison to the adipocyte and EHMN model. A.6.3 Comparison of expression data The following tables show the comparison of the top 10 reporter metabolites using the different expression data of this dataset and the same model. Adipocyte day 0 vs. day 14 KEGG ID day 0 vs. 
day 56 Metabolite name Rank P-value Rank P-value C00083 Malonyl-CoA 1 0.0031377 69 0.157038 C00041 L-Alanine 2 0.00446266 46 0.113193 C00097 L-Cysteine 2 0.00446266 99 0.25144 C00164 Acetoacetate 3 0.00471785 179 eicosadienoyl-CoA (C20:2CoA, n-6) 4 0.00673465 9 0.0373189 dodecanoate (C12:0) 5 0.00717003 32 0.0852161 Eicosanoate (n-C20:0) 5 0.00717003 32 0.0852161 C00249 hexadecanoate (n-C16:0) 5 0.00717003 32 0.0852161 C01530 octadecanoate (n-C18:0) 5 0.00717003 53 0.124034 pentadecanoate (C15:0) 5 0.00717003 32 0.0852161 tetradecanoate (C14:0) 5 0.00717003 32 0.0852161 docosenoyl-CoA (C22:1CoA, n-9) 6 0.00736875 16 0.0515015 C05272 hexadecenoyl-CoA (C16:1CoA, n-9) 6 0.00736875 16 0.0515015 C00510 octadecenoyl-CoA (C18:1CoA, n-7) 6 0.00736875 71 0.159726 C03373 5-amino-1-(5-phospho-D-ribosyl)imidazole 7 0.00862875 37 0.0937082 C00026 2-Oxoglutarate 8 0.0120078 42 0.101344 sn-Glycerol 3-phosphate 9 0.0135386 17 0.0534647 0.015976 65 0.152882 C02679 C06424 C00010 Coenzyme A 10 0.562478 Table A.66: The comparison of the top 10 reporter metabolites between ’day 0 vs. day 14’ and ’day 0 vs. day 56’ based on the adipocyte model. 103 APPENDIX A. RESULTS OF ALL SELECTED DATASETS day 0 vs. day 56 KEGG ID day 0 vs. 
day 14 Metabolite name Rank P-value Rank P-value C00164 Acetoacetate 1 0.00133727 240 0.753101 C01342 Ammonium 2 0.00149079 210 0.595094 C00122 Fumarate 3 0.0141678 40 0.0995445 C00092 D-Glucose 6-phosphate 4 0.014582 99 0.285809 C00365 dUMP 5 0.0159453 175 0.498916 heptadecenoyl CoA (C17:1CoA, n-8) 6 0.0259099 20 C00006 Nicotinamide adenine dinucleotide phosphate 7 0.0290297 194 0.543631 C00005 Nicotinamide adenine dinucleotide phosphate - reduced 7 0.0290297 194 0.543631 C00364 dTMP 8 0.0359442 233 0.707671 eicosadienoyl-CoA (C20:2CoA, n-6) 9 0.0373189 4 0.0372373 0.00673465 C00027 Hydrogen peroxide 10 0.0392387 230 0.698221 C00007 O2 10 0.0392387 107 0.30699 C00704 Superoxide 10 0.0392387 31 0.0671406 Table A.67: The comparison of the top 10 reporter metabolites between ’day 0 vs. day 56’ and ’day 0 vs. day 14’ based on the adipocyte model. EHMN day 0 vs. day 14 KEGG ID day 0 vs. day 56 Metabolite name Rank P-value Rank P-value C00024 Acetyl-CoA 1 0.00143499 19 0.0139786 CE5166 25(S)-trihydroxycoprostanoyl-CoA 2 0.00148815 191 0.102176 C00093 sn-Glycerol 3-phosphate 3 0.00199595 369 0.192557 C03373 Aminoimidazole ribotide 4 0.00475059 168 0.0897013 C03125 L-Cysteinyl-tRNA(Cys) 5 0.00489944 174 0.0946223 C01639 tRNA(Cys) 5 0.00489944 174 0.0946223 C00026 2-Oxoglutarate 6 0.00542615 535 0.274675 C02249 Arachidonyl-CoA 7 0.00641238 23 0.0161232 C00018 Pyridoxal phosphate 8 0.00698159 15 0.0109335 C00534 Pyridoxamine 8 0.00698159 15 0.0109335 C00647 Pyridoxamine phosphate 8 0.00698159 15 0.0109335 C00314 Pyridoxine 8 0.00698159 15 0.0109335 C00627 Pyridoxine phosphate 8 0.00698159 15 0.0109335 C03721 Protein tyrosine-O-sulfate 9 0.00761396 33 0.0188186 C00008 ADP 10 0.00790762 625 0.320019 Table A.68: The comparison of the top 10 reporter metabolites between ’day 0 vs. day 14’ and ’day 0 vs. day 56’ based on the EHMN model. 104 APPENDIX A. RESULTS OF ALL SELECTED DATASETS day 0 vs. day 56 KEGG ID day 0 vs. 
day 14 Metabolite name Rank P-value Rank P-value G00159 (Gal)2 (GalNAc)1 (GlcA)2 (Xyl)1 (Ser)1 1 0.00119035 1045 0.513807 CN0004 IdoAbeta1-3GalNAcbeta1-4IdoAbeta1-3GalNAcbeta1-4GlcAbeta1- 1 0.00119035 1045 0.513807 3Galbeta1-3Galbeta1-4Xylbeta1-Ser-peptide C01777 Acylcholine 2 0.00133707 1987 0.947088 C00060 Carboxylate 2 0.00133707 1282 0.62891 G00160 (Gal)2 (GalNAc)2 (GlcA)2 (Xyl)1 (Ser)1 3 0.00240525 1346 0.663502 CN0005 Chondroitin sulfate C 3 0.00240525 1346 0.663502 CN0006 Chondroitin sulfate D 3 0.00240525 1346 0.663502 CN0007 Chondroitin sulfate E 3 0.00240525 1346 0.663502 C00426 Dermatan sulfate 3 0.00240525 1346 0.663502 CN0002 GalNAcbeta1-4IdoAbeta1-3GalNAcbeta1-4GlcAbeta1-3Galbeta1- 3 0.00240525 1346 0.663502 3 0.00240525 1819 0.879743 3 0.00240525 1572 0.759935 3Galbeta1-4Xylbeta1-Ser-peptide CN0003 GlcAbeta1-3GalNAcbeta1-4IdoAbeta1-3GalNAcbeta1-4GlcAbeta13Galbeta1-3Galbeta1-4Xylbeta1-Ser-peptide CN0001 IdoAbeta1-3GalNAcbeta1-4GlcAbeta1-3Galbeta1-3Galbeta1-4Xylbeta1Ser-peptide C00164 Acetoacetate 4 0.00298485 67 0.0335582 C00008 ADP 5 0.00414578 40 0.0224009 C00002 ATP 5 0.00414578 147 0.0694499 CE5869 lysyl-proline 6 0.00477995 405 0.18819 CE5868 N-acetyl-seryl-aspartate 6 0.00477995 405 0.18819 CE5867 N-acetyl-seryl-aspartyl-lysyl-proline 6 0.00477995 405 0.18819 C01061 4-Fumarylacetoacetate 7 0.00491131 2019 0.959752 CE0852 palmitoleoyl-CoA 8 0.0073061 369 0.175351 CE5787 kinetensin 1-3 9 0.00811106 1277 0.627053 C00257 D-Gluconic acid 10 0.00922411 353 0.165674 Table A.69: The comparison of the top 10 reporter metabolites between ’day 0 vs. day 56’ and ’day 0 vs. day 14’ based on the EHMN model. Recon 1 day 0 vs. day 14 KEGG ID day 0 vs. 
day 56 Metabolite name Rank P-value Rank P-value C00097 L-Cysteine 1 0.000527156 425 0.307966 C00422 triacylglycerol (homo sapiens) 2 0.000885051 1192 0.959725 C00164 Acetoacetate 3 0.00135937 472 0.351164 C00001 H2O 4 0.00302208 458 0.333773 C19586 8,9 epxoy aflatoxin B1 5 0.00329827 339 0.234191 C06800 aflatoxin B1 5 0.00329827 339 0.234191 C14497 6 beta hydroxy testosterone 6 0.00346225 333 0.230306 C00249 Hexadecanoate (n-C16:0) 7 0.0046149 391 0.283537 C03373 5-amino-1-(5-phospho-D-ribosyl)imidazole 8 0.00590893 108 0.0838115 C00083 Malonyl-CoA 9 0.00818013 367 0.264933 C01944 Octanoyl-CoA (n-C8:0CoA) 10 0.00924781 284 0.194032 Table A.70: The comparison of the top 10 reporter metabolites between ’day 0 vs. day 14’ and ’day 0 vs. day 56’ based on the Recon 1 model. 105 APPENDIX A. RESULTS OF ALL SELECTED DATASETS day 0 vs. day 56 KEGG ID C00164 day 0 vs. day 14 Metabolite name Rank P-value Rank P-value Acetoacetate 1 0.00034461 823 0.648248 R total 3 Coenzyme A 2 0.000626834 240 0.183677 C00510 Octadecenoyl-CoA (n-C18:1CoA) 3 0.000677739 213 0.163803 C16218 trans-Octadec-2-enoyl-CoA 3 0.000677739 280 0.210177 vaccenyl coenzyme A 3 0.000677739 280 0.210177 C05272 Hexadecenoyl-CoA (n-C16:1CoA) 4 0.00142397 1195 C00412 Stearoyl-CoA (n-C18:0CoA) 5 0.00144629 113 0.0831994 alpha-Linolenoyl-CoA 6 0.0020225 280 0.210177 linoelaidyl coenzyme A 6 0.0020225 280 0.210177 linoleic coenzyme A 6 0.0020225 280 0.210177 tetracosapentaenoyl coenzyme A, n-3 6 0.0020225 357 0.276244 C01342 Ammonium 7 0.0032767 958 0.757085 C00154 Palmitoyl-CoA (n-C16:0CoA) 8 0.00335413 125 0.0939063 arachidyl coenzyme A 9 0.00409846 826 0.651886 cervonyl coenzyme A 9 0.00409846 826 0.651886 docosa-4,7,10,13,16-pentaenoyl coenzyme A 9 0.00409846 826 0.651886 heptadecanoyl coa 9 0.00409846 280 0.210177 Hexacosanoyl-CoA (n-C26:0CoA) 9 0.00409846 357 0.276244 lignocericyl coenzyme A 9 0.00409846 357 0.276244 nervonyl coenzyme A 9 0.00409846 357 0.276244 pentadecanoyl Coenzyme A 9 0.00409846 280 
0.210177 tetracosapentaenoyl coenzyme A, n-6 9 0.00409846 357 0.276244 C16171 tetracosatetraenoyl coenzyme A 9 0.00409846 357 0.276244 C01211 Procollagen 5-hydroxy-L-lysine 10 0.00797606 294 0.226234 C16740 Procollagen L-lysine 10 0.00797606 294 0.226234 C16173 0.96341 Table A.71: The comparison of the top 10 reporter metabolites between ’day 0 vs. day 56’ and ’day 0 vs. day 14’ based on the Recon 1 model. 106 APPENDIX A. RESULTS OF ALL SELECTED DATASETS A.7 Hypoxia-induced modulation of gene expression in human adipocytes (GSE34007) Human adipocytes (Zen-bio cells) were incubated in hypoxic conditions (1% O2 ) for 24 h. Control human adipocytes were incubated under normoxic conditions (21% O2 ) [94]. The comparison of normoxic against hypoxic conditions is used for the calculation of the differentially expressed genes. A.7.1 Differentially expressed genes The following table shows the top 10 differentially expressed genes, the pathways they are involved in, and those top 10 reporter metabolites, which are also involved in these pathways. Normoxic vs. hypoxic conditions Rank EntrezID GeneName P-value Pathway egl nine homolog 3 1.14428553605755e-09 Adipocyte EHMN Recon1 hsa05200: Pathways in C00122 C00149 C00149 cancer C00149 C01245 hsa05211: Renal cell C00122 C00149 C00149 carcinoma C00149 C00301 RefSeqID 1 112399 NM 022073 2 5138 NM 002599 (C. 
elegans) phosphodiesterase hsa00230: Purine C00002 C00059 2A, cGMP- 1.42708417065616e-09 metabolism C00008 C00301 stimulated hsa05032: Morphine addiction 3 6927 HNF1 homeobox A 1.96590456276061e-09 metallothionein 3 3.565624773826e-09 chemokine (C-X-C 4.24736720895016e-09 NM 000545 4 4504 hsa04950: Maturity onset diabetes of the young NM 005954 5 7852 NM 001008540 motif) receptor 4 hsa04060: Cytokinecytokine receptor interaction hsa04062: Chemokine signaling pathway hsa04144: Endocytosis C01245 C00002 C00008 hsa04360: Axon guidance hsa04670: Leukocyte transendothelial migration hsa04672: Intestinal immune network for IgA production 6 146439 7 768 coiled-coil domain 8.16911348924392e-09 containing 64B NM 001216 8 54210 NM 018643 carbonic 1.03629888901659e-08 anhydrase IX triggering receptor 1.59408005433769e-08 expressed on myeloid cells 1 9 362 aquaporin 5 1.65740906047528e-08 NM 001651 10 5055 NM 001143818 hsa04970: Salivary C01245 C01330 secretion serpin peptidase 2.30141016000521e-08 hsa05146: Amoebiasis C01245 inhibitor, clade B (ovalbumin), member 2 Table A.72: The top 10 differentially expressed genes from the comparison ’normoxic vs. hypoxic conditions’ with the corresponding pathways and reporter metabolites. 107 APPENDIX A. RESULTS OF ALL SELECTED DATASETS A.7.2 Comparison between the models The following tables show the top 10 reporter metabolites of one model in comparison to the rank of these metabolites using the other two models and the same expression data. Normoxic vs. 
hypoxic conditions

| KEGG ID | Metabolite name | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| C00606 | 3-Sulfino-L-alanine | 1 | 0.00681743 | 26 | 0.0109453 | 33 | 0.0257915 |
| C00445 | 5,10-Methenyltetrahydrofolate | 2 | 0.00728798 | 1309 | 0.637251 | 107 | 0.0754446 |
| C00143 | 5,10-Methylenetetrahydrofolate | 3 | 0.00990644 | 67 | 0.0270489 | 198 | 0.139875 |
| C00149 | L-Malate | 4 | 0.0108091 | 96 | 0.0375185 | 3 | 0.00385016 |
| C00079 | L-Phenylalanine | 5 | 0.0147974 | 1091 | 0.522332 | 512 | 0.416332 |
| C00078 | L-Tryptophan | 5 | 0.0147974 | 1363 | 0.658911 | 763 | 0.653734 |
| C00082 | L-Tyrosine | 5 | 0.0147974 | 1363 | 0.658911 | 763 | 0.653734 |
| C00049 | L-Aspartate | 6 | 0.0171947 | 607 | 0.262324 | 7 | 0.00724726 |
| C00234 | 10-Formyltetrahydrofolate | 7 | 0.019219 | 3 | 0.00164668 | 89 | 0.0615075 |
| C00058 | Formate | 7 | 0.019219 | 1031 | 0.491965 | 750 | 0.648936 |
| C01107 | (R)-5-Phosphomevalonate | 8 | 0.0236819 | 38 | 0.0172219 | 21 | 0.0199861 |
| C00008 | ADP | 8 | 0.0236819 | 514 | 0.215428 | 277 | 0.198629 |
| C00002 | ATP | 8 | 0.0236819 | 961 | 0.454433 | 200 | 0.142798 |
| C00122 | Fumarate | 9 | 0.0290069 | 1501 | 0.718153 | 48 | 0.0371757 |
| C00864 | (R)-Pantothenate | 10 | 0.0300113 | 64 | 0.0253678 | 37 | 0.0290031 |

Table A.73: The top 10 reporter metabolites of the comparison ‘normoxic vs. hypoxic conditions’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
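The P-values in these tables are per-metabolite scores derived from gene-level P-values. In the commonly described reporter metabolite scheme, the P-value of each gene whose enzyme acts on a metabolite is converted to a Z-score with the inverse normal cumulative distribution, the k Z-scores are summed and divided by the square root of k, and the result is converted back to a P-value. The Python sketch below shows only this aggregation step under stated assumptions: the function names are hypothetical, the gene P-values are invented, and the background correction against random gene sets of the same size is deliberately omitted. It is not the implementation used for the results above.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def reporter_z(gene_p_values):
    """Aggregate Z-score of the k genes neighbouring one metabolite.

    Each gene P-value must lie strictly between 0 and 1, otherwise
    NormalDist.inv_cdf raises a StatisticsError.
    """
    z_scores = [STD_NORMAL.inv_cdf(1.0 - p) for p in gene_p_values]
    return sum(z_scores) / sqrt(len(z_scores))


def reporter_p(gene_p_values):
    """Uncorrected metabolite P-value derived from the aggregate Z-score."""
    return 1.0 - STD_NORMAL.cdf(reporter_z(gene_p_values))


# Invented gene-level P-values for the enzymes around one metabolite.
neighbour_genes = [0.001, 0.02, 0.3]
print(reporter_z(neighbour_genes), reporter_p(neighbour_genes))
```

Because the score depends on which genes a model associates with a metabolite, the same expression data can yield different ranks in the adipocyte, EHMN, and Recon 1 models.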
| KEGG ID | Metabolite name | EHMN rank | EHMN P-value | Adipocyte rank | Adipocyte P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| C00094 | Sulfite | 1 | 0.00129194 | NA | NA | 2 | 0.00355222 |
| C00301 | ADPribose | 2 | 0.00152273 | NA | NA | 5 | 0.00539267 |
| C00234 | 10-Formyltetrahydrofolate | 3 | 0.00164668 | 7 | 0.019219 | 89 | 0.0615075 |
| C11555 | 1D-myo-Inositol 1,4,5,6-tetrakisphosphate | 4 | 0.0026786 | NA | NA | 681 | 0.58955 |
| C01245 | D-myo-Inositol 1,4,5-trisphosphate | 4 | 0.0026786 | NA | NA | 708 | 0.619569 |
| C02249 | Arachidonyl-CoA | 5 | 0.00275275 | NA | NA | 404 | 0.308764 |
| C14819 | Fe3+ | 6 | 0.00276528 | NA | NA | 832 | 0.703891 |
| C00059 | Sulfate | 7 | 0.00278386 | NA | NA | 826 | 0.700155 |
| C02939 | 3-Methylbutanoyl-CoA | 8 | 0.00313494 | 208 | 0.701614 | 68 | 0.0519052 |
| C00025 | L-Glutamate | 9 | 0.00338359 | 215 | 0.722205 | 303 | 0.214459 |
| C00149 | (S)-Malate | 10 | 0.00378207 | 4 | 0.0108091 | 3 | 0.00385016 |

Table A.74: The top 10 reporter metabolites of the comparison ‘normoxic vs. hypoxic conditions’ using the EHMN model in comparison to the adipocyte and Recon 1 model.

| KEGG ID | Metabolite name | Recon 1 rank | Recon 1 P-value | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value |
|---|---|---|---|---|---|---|---|
| C00301 | ADPribose | 1 | 0.00200643 | NA | NA | 12 | 0.0042943 |
| C00094 | Sulfite | 2 | 0.00355222 | NA | NA | 1 | 0.00129194 |
| C00149 | L-Malate | 3 | 0.00385016 | 4 | 0.0108091 | 96 | 0.0375185 |
| C00500 | Biliverdin | 4 | 0.00462645 | NA | NA | 22 | 0.0101831 |
| C00153 | Nicotinamide | 5 | 0.00539267 | NA | NA | 655 | 0.28395 |
| C00003 | Nicotinamide adenine dinucleotide | 5 | 0.00539267 | 179 | 0.556851 | 167 | 0.0659428 |
| C00006 | Nicotinamide adenine dinucleotide phosphate | 5 | 0.00539267 | 26 | 0.0542526 | 562 | 0.239822 |
| C00399 | Ubiquinone-10 | 6 | 0.00604565 | NA | NA | 16 | 0.0073437 |
| C00049 | L-Aspartate | 7 | 0.00724726 | 6 | 0.0171947 | 607 | 0.262324 |
| C01330 | Sodium | 8 | 0.00877595 | 93 | 0.293479 | 2104 | 0.992782 |
| C00010 | Coenzyme A | 9 | 0.010932 | 34 | 0.0766474 | 1312 | 0.639169 |
| C00237 | Carbon monoxide | 10 | 0.0109933 | NA | NA | 21 | 0.00911377 |
| C00023 | Fe2+ | 10 | 0.0109933 | NA | NA | NA | NA |

Table A.75: The top 10 reporter metabolites of the comparison ‘normoxic vs. hypoxic conditions’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
RESULTS OF ALL SELECTED DATASETS A.8 Differential gene expression in adipose tissue from obese human subjects during weight loss and weight maintenance (GSE35411) Low calorie diet (LCD) containing 1200 kcal/day for three months. Following the weight reduction phase for six month follow-up period [95]: - 3 paired samples per proband, taken at baseline, after weight reduction, after weight maintenance phase The following comparisons were applied for the calculation of the differentially expressed genes: (i) baseline vs. after weight reduction and (ii) baseline vs. after weight maintenance phase A.8.1 Differentially expressed genes The following tables show the top 10 differentially expressed genes, the pathways they are involved in, and those top 10 reporter metabolites, which are also involved in these pathways. Baseline vs. after weight reduction Rank EntrezID GeneName P-value Pathway RefSeqID 1 8365 BC010926.1 HIST1H4H - 2.38006191220476e-07 histone cluster hsa05322: Systemic lupus 1, H4h 2 55973 NM 001008406.1 BCAP29 - B-cell hsa05034: Alcoholism erythematosus 6.2098504477583e-07 receptor-associated protein 29 3 1622 binding inhibitor M15887.1 (GABA receptor NM 001079862.1 modulator, acyl- NM 001079863.1 CoA binding NM 020548.5 4 DBI - diazepam CR456956.1 55969 6.27199638434536e-07 pathway protein) C20orf24 - AF274936.1 chromosome 20 BC001871.1 open reading BC004446.1 frame 24 hsa03320: PPAR signaling 6.67608370868795e-07 NM 018840.2 NM 199483.1 109 Adipocyte EHMN Recon1 APPENDIX A. 
RESULTS OF ALL SELECTED DATASETS Rank EntrezID GeneName P-value Pathway Adipocyte EHMN Recon1 RefSeqID 5 125 ADH1B - alcohol hsa00010: Glycolysis/ C00111 NM 000668.3 dehydrogenase 1B 1.113560790908e-06 Gluconeogenesis C00236 (class I), beta hsa00071: Fatty acid polypeptide metabolism C00332 hsa00350: Tyrosine C00122 metabolism C01036 C01036 C01179 hsa00830: Retinol metabolism hsa00980: Metabolism of xenobiotics by cytochrome P450 hsa00982: Drug metabolism- cytochrome P450 hsa01100: Metabolic C00097 C00003 C00003 pathways C00111 C00004 C00332 C00122 C00100 C00906 C00199 C00577 C01024 C00236 C01024 C01036 C00606 C01051 C01036 C03845 C01179 C05437 C03684 C05446 C05467 6 23086 AY099469.1 EXPH5 - 1.1786953108756e-06 exophilin 5 NM 015065.1 7 55904 AY147037.1 8 MLL5 - myeloid/ NM 018682.3 mixed-lineage NM 182931.2 leukemia 5 1806 NM 000110.3 1.50324865193418e-06 lymphoid or DPYD - dihydro- hsa00310: Lysine C00332 degradation 1.90562620638919e-06 hsa00240: Pyrimidine pyrimidine metabolism dehydrogenase hsa00410: beta-Alanine C00906 C00100 metabolism hsa00770: Pantothenate C00097 and CoA biosynthesis hsa00983: Drug metabolism - other enzymes hsa01100: Metabolic C00097 C00003 C00003 pathways C00111 C00004 C00332 C00122 C00100 C00906 C00199 C00577 C01024 C00236 C01024 C01036 C00606 C01051 C01036 C03845 C01179 C05437 C03684 C05446 C05467 9 9669 NM 015904.3 EIF5B - eukaryotic 1.94232219398723e-06 hsa03013: RNA transport 2.29350303664699e-06 hsa04510: Focal adhesion translation initiation factor 5B 10 1290 BC086874.1 NM 000393.3 COL5A2 collagen, type V, hsa04512: ECM-receptor alpha 2 interaction BC043613.1 hsa04974: Protein digestion C00097 and absorption hsa05146: Amoebiasis Table A.76: The top 10 differentially expressed genes from the comparison ’baseline vs. after weight reduction’ with the corresponding pathways and reporter metabolites. 110 APPENDIX A. RESULTS OF ALL SELECTED DATASETS Baseline vs. 
after weight maintenance phase Rank EntrezID GeneName P-value Pathway JAZF1 - JAZF zinc 6.73014226330345e-07 Adipocyte EHMN Recon1 hsa00052: Galactose C00031 C00031 metabolism C00124 RefSeqID 1 221895 NM 175061.3 2 128989 C22orf25 - chromo- BC041339.1 some 22 open read- NM 152906.2 3 finger 1 2720 M27508.1 7.17308804478182e-07 ing frame 25 GLB1 - galacto- 1.18268373244184e-06 sidase, beta 1 M34423.1 C00267 NM 000404.2 hsa00511: Other glycan NM 001079811.1 degradation hsa00531: Glycosaminoglycan degradation hsa00600: Sphingolipid C00195 metabolism C01190 C00195 C01290 hsa00604: Glycosphingolipid biosynthesis ganglio series hsa01100: Metabolic C00010 C00001 C00001 pathways C00024 C00005 C00005 C00083 C00006 C00006 C00100 C00024 C00031 C00197 C00031 C00080 C05272 C00064 C00195 C00124 G00019 C00195 G00163 C00267 G00164 C01190 C01290 hsa04142: Lysosome 4 9659 AL832024.2 5 6 PDE4DIP - NM 001002811.1 4D interacting NM 001002812.1 protein 116441 TM4SF18 - trans- BC014339.1 membrane 4 L six NM 138786.1 family member 18 2752 BC051726.1 1.55563331646695e-06 phosphodiesterase GLUL - glutamate- 2.08304076554401e-06 3.67979597866496e-06 ammonia ligase hsa00250: Alanine, C00064 aspartate and glutamate NM 001033044.1 metabolism NM 001033056.1 hsa00330: Arginine and NM 002065.4 C00064 proline metabolism hsa00630: Glyoxylate and C00024 C00024 dicarboxylate metabolism C00100 C00064 C00197 hsa01100: Metabolic C00010 C00001 C00001 pathways C00024 C00005 C00005 C00083 C00006 C00006 C00100 C00024 C00031 C00197 C00031 C00080 C05272 C00064 C00195 C00124 G00019 C00195 G00163 C00267 G00164 C01190 C01290 hsa04724: Glutamatergic C00064 synapse hsa04727: GABAergic synapse 7 57124 NM 020404.2 CD248 - CD248 3.83457619319249e-06 molecule, endosialin 111 C00064 APPENDIX A. 
| Rank | EntrezID | RefSeqID | GeneName | P-value | Pathway | Adipocyte | EHMN | Recon1 |
|---|---|---|---|---|---|---|---|---|
| 8 | 54849 | BC015482.2, BC105592.1, NM_017702.2, NM_207514.1 | FLJ20186 - differentially expressed in FDCP 8 homolog | 4.75472638872404e-06 | | | | |
| 9 | 54502 | NM_019027.1 | FLJ20273 - RNA binding motif protein 47 | 5.01882280669899e-06 | | | | |
| 10 | 5783 | D21209.1, NM_080683.1, D21210.1, NM_006264.1, D21211.1, NM_080684.1, NM_080685.1, U12128.1 | PTPN13 - protein tyrosine phosphatase, non-receptor type 13 (APO-1/CD95 (Fas)-associated phosphatase) | 6.15726497888553e-06 | | | | |

Table A.77: The top 10 differentially expressed genes from the comparison ‘baseline vs. after weight maintenance phase’ with the corresponding pathways and reporter metabolites.

A.8.2 Comparison between the models

The following tables show the top 10 reporter metabolites of one model in comparison to the rank of these metabolites using the other two models and the same expression data.

Baseline vs. after weight reduction

Adipocyte model

| KEGG ID | Metabolite name | Adipocyte rank | Adipocyte p-value | EHMN rank | EHMN p-value | Recon 1 rank | Recon 1 p-value |
|---|---|---|---|---|---|---|---|
| C01036 | 4-Maleylacetoacetate | 1 | 0.005185 | 19 | 0.00297981 | 4 | 0.0023486 |
| C00536 | Inorganic triphosphate | 2 | 0.00926387 | 29 | 0.00607262 | 7 | 0.00493835 |
| C00111 | Dihydroxyacetone phosphate | 3 | 0.0115888 | 91 | 0.0249449 | 51 | 0.0317427 |
| C00606 | 3-Sulfino-L-alanine | 4 | 0.0138944 | 52 | 0.0125323 | 12 | 0.0089768 |
| C00097 | L-Cysteine | 5 | 0.0164945 | 109 | 0.0340762 | 102 | 0.0643997 |
| C01179 | 3-(4-Hydroxyphenyl)pyruvate | 6 | 0.0220697 | 1996 | 0.949555 | 640 | 0.539659 |
| C00122 | Fumarate | 7 | 0.0227082 | 476 | 0.214497 | 87 | 0.053884 |
| C00236 | 3-Phospho-D-glyceroyl phosphate | 8 | 0.0243508 | 404 | 0.175407 | 323 | 0.242925 |
| C00199 | D-Ribulose 5-phosphate | 9 | 0.0249655 | 64 | 0.0161579 | 19 | 0.0141511 |
| C03684 | 6-Pyruvoyl-5,6,7,8-tetrahydropterin | 10 | 0.0267392 | 29 | 0.00607262 | 23 | 0.0159692 |

Table A.78: The top 10 reporter metabolites of the comparison ‘baseline vs. after weight reduction’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
EHMN model

| KEGG ID | Metabolite name | EHMN rank | EHMN p-value | Adipocyte rank | Adipocyte p-value | Recon 1 rank | Recon 1 p-value |
|---|---|---|---|---|---|---|---|
| C00412 | Stearoyl-CoA | 1 | 0.000119047 | 120 | 0.429406 | 349 | 0.275854 |
| CE0852 | palmitoleoyl-CoA | 2 | 0.000193774 | NA | NA | NA | NA |
| C00003 | NAD+ | 3 | 0.000216192 | 75 | 0.240684 | 9 | 0.0051958 |
| C02249 | Arachidonyl-CoA | 4 | 0.000271037 | NA | NA | 492 | 0.394575 |
| C00004 | NADH | 5 | 0.000309983 | 75 | 0.240684 | 15 | 0.0115902 |
| C01024 | Hydroxymethylbilane | 6 | 0.000391225 | NA | NA | 1 | 0.000272881 |
| C00577 | D-Glyceraldehyde | 7 | 0.000495807 | NA | NA | 85 | 0.0528238 |
| C02050 | Linoleoyl-CoA | 8 | 0.000546936 | NA | NA | NA | NA |
| CE2254 | docosanoyl-CoA | 9 | 0.000760086 | NA | NA | NA | NA |
| C00100 | Propanoyl-CoA | 9 | 0.000760086 | 240 | 0.796681 | 96 | 0.0615026 |
| CE0713 | 3-oxolinoleoyl-CoA | 10 | 0.000765249 | NA | NA | NA | NA |

Table A.79: The top 10 reporter metabolites of the comparison ‘baseline vs. after weight reduction’ using the EHMN model in comparison to the adipocyte and Recon 1 model.

Recon 1 model

| KEGG ID | Metabolite name | Recon 1 rank | Recon 1 p-value | Adipocyte rank | Adipocyte p-value | EHMN rank | EHMN p-value |
|---|---|---|---|---|---|---|---|
| C01024 | Hydroxymethylbilane | 1 | 0.000272881 | NA | NA | 6 | 0.000391225 |
| C05437 | zymosterol | 2 | 0.000387591 | 42 | 0.164672 | 190 | 0.0680593 |
| C00906 | 5,6-Dihydrothymine | 3 | 0.00163198 | NA | NA | 85 | 0.0219635 |
| C01036 | 4-Maleylacetoacetate | 4 | 0.0023486 | 1 | 0.005185 | 19 | 0.00297981 |
| C05446 | 3alpha,7alpha,12alpha,26-Tetrahydroxy-5beta-cholestane | 5 | 0.00301134 | NA | NA | 734 | 0.360634 |
| C05467 | 3alpha,7alpha,12alpha-Trihydroxy-5beta-24-oxocholestanoyl-CoA | 5 | 0.00301134 | NA | NA | 324 | 0.141624 |
| C01051 | Uroporphyrinogen III | 6 | 0.00485445 | NA | NA | 27 | 0.00591737 |
| C00536 | Inorganic triphosphate | 7 | 0.00493835 | 2 | 0.00926387 | 29 | 0.00607262 |
| C03845 | Zymostenol | 8 | 0.00514724 | 29 | 0.118452 | 460 | 0.205838 |
| C00003 | Nicotinamide adenine dinucleotide | 9 | 0.0051958 | 75 | 0.240684 | 3 | 0.000216192 |
| C00332 | Acetoacetyl-CoA | 10 | 0.00570785 | 125 | 0.439106 | 20 | 0.00349762 |

Table A.80: The top 10 reporter metabolites of the comparison ‘baseline vs. after weight reduction’ using the Recon 1 model in comparison to the adipocyte and EHMN model.

Baseline vs. after weight maintenance phase

Adipocyte model

| KEGG ID | Metabolite name | Adipocyte rank | Adipocyte p-value | EHMN rank | EHMN p-value | Recon 1 rank | Recon 1 p-value |
|---|---|---|---|---|---|---|---|
| C00024 | Acetyl-CoA | 1 | 0.000563428 | 10 | 0.000879614 | 287 | 0.230087 |
| C00010 | Coenzyme A | 2 | 0.00192595 | 270 | 0.105376 | 574 | 0.479357 |
| C00083 | Malonyl-CoA | 3 | 0.00537761 | 141 | 0.0546307 | 1013 | 0.844983 |
| | heptadecenoyl CoA (C17:1CoA, n-8) | 4 | 0.00968814 | NA | NA | NA | NA |
| | 1-Acyl-sn-glycerol 3-phosphate, adipocyte | 5 | 0.00994868 | NA | NA | NA | NA |
| C01342 | Ammonium | 6 | 0.011238 | 1997 | 0.953823 | 641 | 0.546636 |
| C00100 | Propanoyl-CoA (C3:0CoA) | 7 | 0.0129976 | 16 | 0.00261942 | 789 | 0.675516 |
| C00197 | 3-Phospho-D-glycerate | 8 | 0.0152006 | 109 | 0.0393705 | 408 | 0.32977 |
| | eicosadienoyl-CoA (C20:2CoA, n-6) | 9 | 0.0163288 | NA | NA | NA | NA |
| | docosenoyl-CoA (C22:1CoA, n-9) | 10 | 0.0194561 | NA | NA | NA | NA |
| C05272 | hexadecenoyl-CoA (C16:1CoA, n-9) | 10 | 0.0194561 | 1748 | 0.833331 | 317 | 0.250358 |
| C00510 | octadecenoyl-CoA (C18:1CoA, n-7) | 10 | 0.0194561 | 820 | 0.389293 | 116 | 0.100258 |

Table A.81: The top 10 reporter metabolites of the comparison ‘baseline vs. after weight maintenance phase’ using the adipocyte model in comparison to the EHMN and Recon 1 model.

EHMN model

| KEGG ID | Metabolite name | EHMN rank | EHMN p-value | Adipocyte rank | Adipocyte p-value | Recon 1 rank | Recon 1 p-value |
|---|---|---|---|---|---|---|---|
| C00001 | H2O | 1 | 0.000000289791 | 164 | 0.548295 | 800 | 0.686042 |
| C00124 | D-Galactose | 2 | 0.0000363024 | NA | NA | 11 | 0.0142181 |
| C00267 | alpha-D-Glucose | 3 | 0.0000936421 | NA | NA | NA | NA |
| C00031 | D-Glucose | 3 | 0.0000936421 | 87 | 0.244664 | 1086 | 0.890196 |
| C00006 | NADP+ | 4 | 0.000202882 | 63 | 0.173565 | 3 | 0.00380984 |
| C00064 | L-Glutamine | 5 | 0.000363629 | 270 | 0.926882 | 1018 | 0.85136 |
| C01290 | beta-D-Galactosyl-1,4-beta-D-glucosylceramide | 6 | 0.000459548 | NA | NA | 933 | 0.789977 |
| C01582 | Galactose | 6 | 0.000459548 | NA | NA | NA | NA |
| C01190 | Glucosylceramide | 7 | 0.000463858 | NA | NA | 827 | 0.707438 |
| C00195 | N-Acylsphingosine | 8 | 0.00048447 | NA | NA | 827 | 0.707438 |
| C00005 | NADPH | 9 | 0.000500391 | 63 | 0.173565 | 5 | 0.00583327 |
| C00024 | Acetyl-CoA | 10 | 0.000879614 | 165 | 0.564465 | 287 | 0.230087 |

Table A.82: The top 10 reporter metabolites of the comparison ‘baseline vs. after weight maintenance phase’ using the EHMN model in comparison to the adipocyte and Recon 1 model.

Recon 1 model

| KEGG ID | Metabolite name | Recon 1 rank | Recon 1 p-value | Adipocyte rank | Adipocyte p-value | EHMN rank | EHMN p-value |
|---|---|---|---|---|---|---|---|
| C00001 | H2O | 1 | 0.000273347 | 164 | 0.548295 | 211 | 0.0771878 |
| C00080 | H+ | 2 | 0.000603841 | 206 | 0.678839 | 1717 | 0.819442 |
| C00006 | Nicotinamide adenine dinucleotide phosphate | 3 | 0.00380984 | 63 | 0.173565 | 4 | 0.000202882 |
| C00195 | ceramide (homo sapiens) | 4 | 0.00432081 | NA | NA | 996 | 0.488231 |
| C00005 | Nicotinamide adenine dinucleotide phosphate - reduced | 5 | 0.00583327 | 63 | 0.173565 | 1344 | 0.656163 |
| C00031 | D-Glucose | 6 | 0.00871297 | 87 | 0.244664 | 1527 | 0.737933 |
| G00163 | heparan sulfate, precursor 2 | 7 | 0.00982322 | NA | NA | 33 | 0.0090727 |
| G00164 | heparan sulfate, precursor 3 | 7 | 0.00982322 | NA | NA | 33 | 0.0090727 |
| G00165 | heparan sulfate, precursor 4 | 7 | 0.00982322 | NA | NA | NA | NA |
| | heparan sulfate, precursor 5 | 7 | 0.00982322 | NA | NA | NA | NA |
| | heparan sulfate, precursor 6 | 7 | 0.00982322 | NA | NA | NA | NA |
| | heparan sulfate, precursor 7 | 7 | 0.00982322 | NA | NA | NA | NA |
| | heparan sulfate, precursor 8 | 7 | 0.00982322 | NA | NA | NA | NA |
| | heparan sulfate, precursor 9 | 8 | 0.0106507 | NA | NA | NA | NA |
| | de-Fuc form of PA6 (w/o peptide linkage) | 9 | 0.0115137 | NA | NA | NA | NA |
| | keratan sulfate I, degradation product 2 | 9 | 0.0115137 | NA | NA | NA | NA |
| G00019 | N-Acetyl-beta-D-glucosaminyl-1,2-alpha-D-mannosyl-1,3-(N-acetyl-beta-D-glucosaminyl-1,2-alpha-D-mannosyl-1,6)-(N-acetyl-beta-D-glucosaminyl-1,4)-beta-D-mannosyl-1,4-N-acetyl-beta-D-glucosaminyl-R | 9 | 0.0115137 | NA | NA | 1795 | 0.853731 |
| | n2m2nmasn (w/o peptide linkage) | 9 | 0.0115137 | NA | NA | NA | NA |
| | protein-linked asparagine residue (N-glycosylation site) | 9 | 0.0115137 | NA | NA | NA | NA |
| C00237 | Carbon monoxide | 10 | 0.0123475 | NA | NA | 44 | 0.0114238 |
| C00023 | Fe2+ | 10 | 0.0123475 | NA | NA | NA | NA |

Table A.83: The top 10 reporter metabolites of the comparison ‘baseline vs. after weight maintenance phase’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
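As a rough illustration of how the comparison tables of Section A.8.2 can be assembled, the following Python sketch ranks metabolites by p-value within each model and then looks up the rank and p-value of the reference model's top entries in the other models. It is a minimal sketch under stated assumptions and not the procedure used in this thesis: the function names `dense_rank` and `compare_models` and the toy p-values are hypothetical, metabolites are assumed to be matched by a shared identifier (for example a KEGG ID), and tied p-values are assumed to share one rank, which appears consistent with the repeated ranks in the tables above. A metabolite that is missing from another model is returned as `None`, corresponding to the NA entries.

```python
from typing import Dict, List, Optional, Tuple

PValues = Dict[str, float]  # metabolite ID -> reporter metabolite p-value


def dense_rank(pvalues: PValues) -> Dict[str, int]:
    """Rank IDs by ascending p-value; equal p-values share a rank (1, 2, 2, 3, ...)."""
    ranks: Dict[str, int] = {}
    rank = 0
    previous: Optional[float] = None
    for met_id, p in sorted(pvalues.items(), key=lambda item: item[1]):
        if p != previous:
            rank += 1
            previous = p
        ranks[met_id] = rank
    return ranks


def compare_models(reference: PValues,
                   others: Dict[str, PValues],
                   top: int = 10) -> List[dict]:
    """List every reference metabolite with rank <= top, plus its
    (rank, p-value) in each other model, or (None, None) if absent."""
    ref_ranks = dense_rank(reference)
    other_ranks = {name: dense_rank(p) for name, p in others.items()}
    rows: List[dict] = []
    for met_id, rank in sorted(ref_ranks.items(), key=lambda item: item[1]):
        if rank > top:
            break  # ranks are ascending, so nothing further qualifies
        row = {"id": met_id, "rank": rank, "p": reference[met_id]}
        for name, pvals in others.items():
            pair: Tuple[Optional[int], Optional[float]] = (
                other_ranks[name].get(met_id), pvals.get(met_id))
            row[name] = pair
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Toy values for illustration only; they are not results from the thesis.
    model_a = {"M1": 0.001, "M2": 0.002, "M3": 0.002, "M4": 0.5}
    model_b = {"M1": 0.3, "M3": 0.01}
    for row in compare_models(model_a, {"model_b": model_b}, top=2):
        print(row)
```

Because tied p-values share a rank, a "top 10" list built this way can contain more than ten rows, as is the case in several tables of this appendix.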
A.8.3 Comparison of expression data

The following tables show the comparison of the top 10 reporter metabolites using the different expression data of this dataset and the same model.

Adipocyte

| KEGG ID | Metabolite name | Rank (baseline vs. reduction) | P-value (baseline vs. reduction) | Rank (baseline vs. maintenance) | P-value (baseline vs. maintenance) |
|---|---|---|---|---|---|
| C01036 | 4-Maleylacetoacetate | 1 | 0.005185 | 189 | 0.620879 |
| C00536 | Inorganic triphosphate | 2 | 0.00926387 | 15 | 0.0329617 |
| C00111 | Dihydroxyacetone phosphate | 3 | 0.0115888 | 144 | 0.465898 |
| C00606 | 3-Sulfino-L-alanine | 4 | 0.0138944 | 234 | 0.752851 |
| C00097 | L-Cysteine | 5 | 0.0164945 | 116 | 0.364113 |
| C01179 | 3-(4-Hydroxyphenyl)pyruvate | 6 | 0.0220697 | 198 | 0.654706 |
| C00122 | Fumarate | 7 | 0.0227082 | 108 | 0.331964 |
| C00236 | 3-Phospho-D-glyceroyl phosphate | 8 | 0.0243508 | 18 | 0.0445467 |
| C00199 | D-Ribulose 5-phosphate | 9 | 0.0249655 | 209 | 0.682199 |
| C03684 | 6-Pyruvoyl-5,6,7,8-tetrahydropterin | 10 | 0.0267392 | 59 | 0.165882 |

Table A.84: The comparison of the top 10 reporter metabolites between ‘baseline vs. after weight reduction’ and ‘baseline vs. after weight maintenance phase’ based on the adipocyte model.

| KEGG ID | Metabolite name | Rank (baseline vs. maintenance) | P-value (baseline vs. maintenance) | Rank (baseline vs. reduction) | P-value (baseline vs. reduction) |
|---|---|---|---|---|---|
| C00024 | Acetyl-CoA | 1 | 0.000563428 | 116 | 0.413092 |
| C00010 | Coenzyme A | 2 | 0.00192595 | 169 | 0.583951 |
| C00083 | Malonyl-CoA | 3 | 0.00537761 | 110 | 0.382474 |
| | heptadecenoyl CoA (C17:1CoA, n-8) | 4 | 0.00968814 | 27 | 0.113801 |
| | 1-Acyl-sn-glycerol 3-phosphate, adipocyte | 5 | 0.00994868 | 37 | 0.14905 |
| C01342 | Ammonium | 6 | 0.011238 | 244 | 0.805861 |
| C00100 | Propanoyl-CoA (C3:0CoA) | 7 | 0.0129976 | 240 | 0.796681 |
| C00197 | 3-Phospho-D-glycerate | 8 | 0.0152006 | 128 | 0.44572 |
| | eicosadienoyl-CoA (C20:2CoA, n-6) | 9 | 0.0163288 | 68 | 0.232662 |
| | docosenoyl-CoA (C22:1CoA, n-9) | 10 | 0.0194561 | 70 | 0.233646 |
| C05272 | hexadecenoyl-CoA (C16:1CoA, n-9) | 10 | 0.0194561 | 70 | 0.233646 |
| C00510 | octadecenoyl-CoA (C18:1CoA, n-7) | 10 | 0.0194561 | 120 | 0.429406 |

Table A.85: The comparison of the top 10 reporter metabolites between ‘baseline vs. after weight maintenance phase’ and ‘baseline vs. after weight reduction’ based on the adipocyte model.

EHMN

| KEGG ID | Metabolite name | Rank (baseline vs. reduction) | P-value (baseline vs. reduction) | Rank (baseline vs. maintenance) | P-value (baseline vs. maintenance) |
|---|---|---|---|---|---|
| C00412 | Stearoyl-CoA | 1 | 0.000119047 | 733 | 0.339824 |
| CE0852 | palmitoleoyl-CoA | 2 | 0.000193774 | 1313 | 0.640054 |
| C00003 | NAD+ | 3 | 0.000216192 | 19 | 0.0030949 |
| C02249 | Arachidonyl-CoA | 4 | 0.000271037 | 1896 | 0.902041 |
| C00004 | NADH | 5 | 0.000309983 | 31 | 0.00819121 |
| C01024 | Hydroxymethylbilane | 6 | 0.000391225 | 484 | 0.201965 |
| C00577 | D-Glyceraldehyde | 7 | 0.000495807 | 472 | 0.194984 |
| C02050 | Linoleoyl-CoA | 8 | 0.000546936 | 1247 | 0.603414 |
| CE2254 | docosanoyl-CoA | 9 | 0.000760086 | 654 | 0.29664 |
| C00100 | Propanoyl-CoA | 9 | 0.000760086 | 16 | 0.00261942 |
| CE0713 | 3-oxolinoleoyl-CoA | 10 | 0.000765249 | 1773 | 0.8454 |

Table A.86: The comparison of the top 10 reporter metabolites between ‘baseline vs. after weight reduction’ and ‘baseline vs. after weight maintenance phase’ based on the EHMN model.

| KEGG ID | Metabolite name | Rank (baseline vs. maintenance) | P-value (baseline vs. maintenance) | Rank (baseline vs. reduction) | P-value (baseline vs. reduction) |
|---|---|---|---|---|---|
| C00001 | H2O | 1 | 0.000000289791 | 1847 | 0.888155 |
| C00124 | D-Galactose | 2 | 0.0000363024 | 2087 | 0.990244 |
| C00267 | alpha-D-Glucose | 3 | 0.0000936421 | 2077 | 0.984357 |
| C00031 | D-Glucose | 3 | 0.0000936421 | 1949 | 0.931825 |
| C00006 | NADP+ | 4 | 0.000202882 | 34 | 0.00740209 |
| C00064 | L-Glutamine | 5 | 0.000363629 | 1948 | 0.931821 |
| C01290 | beta-D-Galactosyl-1,4-beta-D-glucosylceramide | 6 | 0.000459548 | 2067 | 0.980227 |
| C01582 | Galactose | 6 | 0.000459548 | 321 | 0.140361 |
| C01190 | Glucosylceramide | 7 | 0.000463858 | 211 | 0.0766204 |
| C00195 | N-Acylsphingosine | 8 | 0.00048447 | 957 | 0.479916 |
| C00005 | NADPH | 9 | 0.000500391 | 813 | 0.40689 |
| C00024 | Acetyl-CoA | 10 | 0.000879614 | 166 | 0.060106 |

Table A.87: The comparison of the top 10 reporter metabolites between ‘baseline vs. after weight maintenance phase’ and ‘baseline vs. after weight reduction’ based on the EHMN model.

Recon 1

| KEGG ID | Metabolite name | Rank (baseline vs. reduction) | P-value (baseline vs. reduction) | Rank (baseline vs. maintenance) | P-value (baseline vs. maintenance) |
|---|---|---|---|---|---|
| C01024 | Hydroxymethylbilane | 1 | 0.000272881 | 261 | 0.216028 |
| C05437 | zymosterol | 2 | 0.000387591 | 792 | 0.677306 |
| C00906 | 5,6-Dihydrothymine | 3 | 0.00163198 | 130 | 0.108296 |
| C01036 | 4-Maleylacetoacetate | 4 | 0.0023486 | 719 | 0.617654 |
| C05446 | 3alpha,7alpha,12alpha,26-Tetrahydroxy-5beta-cholestane | 5 | 0.00301134 | 45 | 0.0364003 |
| C05467 | 3alpha,7alpha,12alpha-Trihydroxy-5beta-24-oxocholestanoyl-CoA | 5 | 0.00301134 | 45 | 0.0364003 |
| C01051 | Uroporphyrinogen III | 6 | 0.00485445 | 102 | 0.0817576 |
| C00536 | Inorganic triphosphate | 7 | 0.00493835 | 54 | 0.0405944 |
| C03845 | Zymostenol | 8 | 0.00514724 | 977 | 0.81583 |
| C00003 | Nicotinamide adenine dinucleotide | 9 | 0.0051958 | 313 | 0.247902 |
| C00332 | Acetoacetyl-CoA | 10 | 0.00570785 | 1143 | 0.932488 |

Table A.88: The comparison of the top 10 reporter metabolites between ‘baseline vs. after weight reduction’ and ‘baseline vs. after weight maintenance phase’ based on the Recon 1 model.

| KEGG ID | Metabolite name | Rank (baseline vs. maintenance) | P-value (baseline vs. maintenance) | Rank (baseline vs. reduction) | P-value (baseline vs. reduction) |
|---|---|---|---|---|---|
| C00001 | H2O | 1 | 0.000273347 | 450 | 0.363208 |
| C00080 | H+ | 2 | 0.000603841 | 305 | 0.222867 |
| C00006 | Nicotinamide adenine dinucleotide phosphate | 3 | 0.00380984 | 195 | 0.149033 |
| C00195 | ceramide (homo sapiens) | 4 | 0.00432081 | 111 | 0.0722453 |
| C00005 | Nicotinamide adenine dinucleotide phosphate - reduced | 5 | 0.00583327 | 248 | 0.186584 |
| C00031 | D-Glucose | 6 | 0.00871297 | 815 | 0.682264 |
| G00163 | heparan sulfate, precursor 2 | 7 | 0.00982322 | 126 | 0.0881313 |
| G00164 | heparan sulfate, precursor 3 | 7 | 0.00982322 | 126 | 0.0881313 |
| G00165 | heparan sulfate, precursor 4 | 7 | 0.00982322 | 126 | 0.0881313 |
| | heparan sulfate, precursor 5 | 7 | 0.00982322 | 126 | 0.0881313 |
| | heparan sulfate, precursor 6 | 7 | 0.00982322 | 126 | 0.0881313 |
| | heparan sulfate, precursor 7 | 7 | 0.00982322 | 126 | 0.0881313 |
| | heparan sulfate, precursor 8 | 7 | 0.00982322 | 126 | 0.0881313 |
| | heparan sulfate, precursor 9 | 8 | 0.0106507 | 13 | 0.0102711 |
| | de-Fuc form of PA6 (w/o peptide linkage) | 9 | 0.0115137 | 116 | 0.0805447 |
| | keratan sulfate I, degradation product 2 | 9 | 0.0115137 | 116 | 0.0805447 |
| G00019 | N-Acetyl-beta-D-glucosaminyl-1,2-alpha-D-mannosyl-1,3-(N-acetyl-beta-D-glucosaminyl-1,2-alpha-D-mannosyl-1,6)-(N-acetyl-beta-D-glucosaminyl-1,4)-beta-D-mannosyl-1,4-N-acetyl-beta-D-glucosaminyl-R | 9 | 0.0115137 | 116 | 0.0805447 |
| | n2m2nmasn (w/o peptide linkage) | 9 | 0.0115137 | 116 | 0.0805447 |
| | protein-linked asparagine residue (N-glycosylation site) | 9 | 0.0115137 | 116 | 0.0805447 |
| C00237 | Carbon monoxide | 10 | 0.0123475 | 771 | 0.636823 |
| C00023 | Fe2+ | 10 | 0.0123475 | 771 | 0.636823 |

Table A.89: The comparison of the top 10 reporter metabolites between ‘baseline vs. after weight maintenance phase’ and ‘baseline vs. after weight reduction’ based on the Recon 1 model.

Appendix - List of Tables

A.1 The top 10 differentially expressed genes from the comparison ‘before vs. after energy restriction’ with the corresponding pathways and reporter metabolites.
A.2 The top 10 differentially expressed genes from the comparison ‘after energy restriction vs. after weight stabilization’ with the corresponding pathways and reporter metabolites.
A.3 The top 10 differentially expressed genes from the comparison ‘before dietary intervention vs. after weight stabilization’ with the corresponding pathways and reporter metabolites.
A.4 The top 10 reporter metabolites of the comparison ‘before vs. after energy restriction’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
A.5 The top 10 reporter metabolites of the comparison ‘before vs. after energy restriction’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
A.6 The top 10 reporter metabolites of the comparison ‘before vs. after energy restriction’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
A.7 The top 10 reporter metabolites of the comparison ‘after energy restriction vs. after weight stabilization’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
A.8 The top 10 reporter metabolites of the comparison ‘after energy restriction vs. after weight stabilization’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
A.9 The top 10 reporter metabolites of the comparison ‘after energy restriction vs. after weight stabilization’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
A.10 The top 10 reporter metabolites of the comparison ‘before dietary intervention vs. after weight stabilization’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
A.11 The top 10 reporter metabolites of the comparison ‘before dietary intervention vs. after weight stabilization’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
A.12 The top 10 reporter metabolites of the comparison ‘before dietary intervention vs. after weight stabilization’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
A.13 The comparison of the top 10 reporter metabolites between ‘before dietary intervention vs. after weight stabilization (DI)’, ‘before vs. after energy restriction (ER)’, and ‘after energy restriction vs. after weight stabilization (WS)’ based on the adipocyte model.
A.14 The comparison of the top 10 reporter metabolites between ‘before vs. after energy restriction (ER)’, ‘before dietary intervention vs. after weight stabilization (DI)’, and ‘after energy restriction vs. after weight stabilization (WS)’ based on the adipocyte model.
A.15 The comparison of the top 10 reporter metabolites between ‘after energy restriction vs. after weight stabilization (WS)’, ‘before dietary intervention vs. after weight stabilization (DI)’, and ‘before vs. after energy restriction (ER)’ based on the adipocyte model.
A.16 The comparison of the top 10 reporter metabolites between ‘before dietary intervention vs. after weight stabilization (DI)’, ‘before vs. after energy restriction (ER)’, and ‘after energy restriction vs. after weight stabilization (WS)’ based on the EHMN model.
A.17 The comparison of the top 10 reporter metabolites between ‘before vs. after energy restriction (ER)’, ‘before dietary intervention vs. after weight stabilization (DI)’, and ‘after energy restriction vs. after weight stabilization (WS)’ based on the EHMN model.
A.18 The comparison of the top 10 reporter metabolites between ‘after energy restriction vs. after weight stabilization (WS)’, ‘before dietary intervention vs. after weight stabilization (DI)’, and ‘before vs. after energy restriction (ER)’ based on the EHMN model.
A.19 The comparison of the top 10 reporter metabolites between ‘before dietary intervention vs. after weight stabilization (DI)’, ‘before vs. after energy restriction (ER)’, and ‘after energy restriction vs. after weight stabilization (WS)’ based on the Recon 1 model.
A.20 The comparison of the top 10 reporter metabolites between ‘before vs. after energy restriction (ER)’, ‘before dietary intervention vs. after weight stabilization (DI)’, and ‘after energy restriction vs. after weight stabilization (WS)’ based on the Recon 1 model.
A.21 The comparison of the top 10 reporter metabolites between ‘after energy restriction vs. after weight stabilization (WS)’, ‘before dietary intervention vs. after weight stabilization (DI)’, and ‘before vs. after energy restriction (ER)’ based on the Recon 1 model.
A.22 The top 10 differentially expressed genes from the comparison ‘insulin resistant vs. insulin sensitive omental tissue’ with the corresponding pathways and reporter metabolites.
A.23 The top 10 differentially expressed genes from the comparison ‘insulin resistant vs. insulin sensitive subcutaneous tissue’ with the corresponding pathways and reporter metabolites.
A.24 The top 10 reporter metabolites of the comparison ‘insulin resistant vs. insulin sensitive omental tissue’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
A.25 The top 10 reporter metabolites of the comparison ‘insulin resistant vs. insulin sensitive omental tissue’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
A.26 The top 10 reporter metabolites of the comparison ‘insulin resistant vs. insulin sensitive omental tissue’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
A.27 The top 10 reporter metabolites of the comparison ‘insulin resistant vs. insulin sensitive subcutaneous tissue’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
A.28 The top 10 reporter metabolites of the comparison ‘insulin resistant vs. insulin sensitive subcutaneous tissue’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
A.29 The top 10 reporter metabolites of the comparison ‘insulin resistant vs. insulin sensitive subcutaneous tissue’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
A.30 The comparison of the top 10 reporter metabolites between ‘insulin resistant vs. insulin sensitive omental tissue’ and ‘insulin resistant vs. insulin sensitive subcutaneous tissue’ based on the adipocyte model.
A.31 The comparison of the top 10 reporter metabolites between ‘insulin resistant vs. insulin sensitive subcutaneous tissue’ and ‘insulin resistant vs. insulin sensitive omental tissue’ based on the adipocyte model.
A.32 The comparison of the top 10 reporter metabolites between ‘insulin resistant vs. insulin sensitive omental tissue’ and ‘insulin resistant vs. insulin sensitive subcutaneous tissue’ based on the EHMN model.
A.33 The comparison of the top 10 reporter metabolites between ‘insulin resistant vs. insulin sensitive subcutaneous tissue’ and ‘insulin resistant vs. insulin sensitive omental tissue’ based on the EHMN model.
A.34 The comparison of the top 10 reporter metabolites between ‘insulin resistant vs. insulin sensitive omental tissue’ and ‘insulin resistant vs. insulin sensitive subcutaneous tissue’ based on the Recon 1 model.
A.35 The comparison of the top 10 reporter metabolites between ‘insulin resistant vs. insulin sensitive subcutaneous tissue’ and ‘insulin resistant vs. insulin sensitive omental tissue’ based on the Recon 1 model.
A.36 The top 10 differentially expressed genes from the comparison ‘active vs. non-active’ with the corresponding pathways and reporter metabolites.
A.37 The top 10 reporter metabolites of the comparison ‘active vs. non-active’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
A.38 The top 10 reporter metabolites of the comparison ‘active vs. non-active’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
A.39 The top 10 reporter metabolites of the comparison ‘active vs. non-active’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
A.40 The top 10 differentially expressed genes from the comparison ‘African Americans vs. Hispanics’ with the corresponding pathways and reporter metabolites.
A.41 The top 10 reporter metabolites of the comparison ‘African Americans vs. Hispanics’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
A.42 The top 10 reporter metabolites of the comparison ‘African Americans vs. Hispanics’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
A.43 The top 10 reporter metabolites of the comparison ‘African Americans vs. Hispanics’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
A.44 The top 10 differentially expressed genes from the comparison ‘WM - before LCD vs. after LCD’ with the corresponding pathways and reporter metabolites.
A.45 The top 10 differentially expressed genes from the comparison ‘WR - before LCD vs. after LCD’ with the corresponding pathways and reporter metabolites.
A.46 The top 10 reporter metabolites of the comparison ‘WM - before LCD vs. after LCD’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
A.47 The top 10 reporter metabolites of the comparison ‘WM - before LCD vs. after LCD’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
A.48 The top 10 reporter metabolites of the comparison ‘WM - before LCD vs. after LCD’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
A.49 The top 10 reporter metabolites of the comparison ‘WR - before LCD vs. after LCD’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
A.50 The top 10 reporter metabolites of the comparison ‘WR - before LCD vs. after LCD’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
A.51 The top 10 reporter metabolites of the comparison ‘WR - before LCD vs. after LCD’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
A.52 The comparison of the top 10 reporter metabolites between ‘WM - before LCD vs. after LCD’ and ‘WR - before LCD vs. after LCD’ based on the adipocyte model.
A.53 The comparison of the top 10 reporter metabolites between ‘WR - before LCD vs. after LCD’ and ‘WM - before LCD vs. after LCD’ based on the adipocyte model.
A.54 The comparison of the top 10 reporter metabolites between ‘WM - before LCD vs. after LCD’ and ‘WR - before LCD vs. after LCD’ based on the EHMN model.
A.55 The comparison of the top 10 reporter metabolites between ‘WR - before LCD vs. after LCD’ and ‘WM - before LCD vs. after LCD’ based on the EHMN model.
A.56 The comparison of the top 10 reporter metabolites between ‘WM - before LCD vs. after LCD’ and ‘WR - before LCD vs. after LCD’ based on the Recon 1 model.
A.57 The comparison of the top 10 reporter metabolites between ‘WR - before LCD vs. after LCD’ and ‘WM - before LCD vs. after LCD’ based on the Recon 1 model.
A.58 The top 10 differentially expressed genes from the comparison ‘day 0 vs. day 14’ with the corresponding pathways and reporter metabolites.
A.59 The top 10 differentially expressed genes from the comparison ‘day 0 vs. day 56’ with the corresponding pathways and reporter metabolites.
A.60 The top 10 reporter metabolites of the comparison ‘day 0 vs. day 14’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
A.61 The top 10 reporter metabolites of the comparison ‘day 0 vs. day 14’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
A.62 The top 10 reporter metabolites of the comparison ‘day 0 vs. day 14’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
A.63 The top 10 reporter metabolites of the comparison ‘day 0 vs. day 56’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
A.64 The top 10 reporter metabolites of the comparison ‘day 0 vs. day 56’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
A.65 The top 10 reporter metabolites of the comparison ‘day 0 vs. day 56’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
A.66 The comparison of the top 10 reporter metabolites between ‘day 0 vs. day 14’ and ‘day 0 vs. day 56’ based on the adipocyte model.
A.67 The comparison of the top 10 reporter metabolites between ‘day 0 vs. day 56’ and ‘day 0 vs. day 14’ based on the adipocyte model.
A.68 The comparison of the top 10 reporter metabolites between ‘day 0 vs. day 14’ and ‘day 0 vs. day 56’ based on the EHMN model.
A.69 The comparison of the top 10 reporter metabolites between ‘day 0 vs. day 56’ and ‘day 0 vs. day 14’ based on the EHMN model.
A.70 The comparison of the top 10 reporter metabolites between ‘day 0 vs. day 14’ and ‘day 0 vs. day 56’ based on the Recon 1 model.
A.71 The comparison of the top 10 reporter metabolites between ‘day 0 vs. day 56’ and ‘day 0 vs. day 14’ based on the Recon 1 model.
A.72 The top 10 differentially expressed genes from the comparison ‘normoxic vs. hypoxic conditions’ with the corresponding pathways and reporter metabolites.
A.73 The top 10 reporter metabolites of the comparison ‘normoxic vs. hypoxic conditions’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
A.74 The top 10 reporter metabolites of the comparison ‘normoxic vs. hypoxic conditions’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
A.75 The top 10 reporter metabolites of the comparison ‘normoxic vs. hypoxic conditions’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
A.76 The top 10 differentially expressed genes from the comparison ‘baseline vs. after weight reduction’ with the corresponding pathways and reporter metabolites.
A.77 The top 10 differentially expressed genes from the comparison ‘baseline vs. after weight maintenance phase’ with the corresponding pathways and reporter metabolites.
A.78 The top 10 reporter metabolites of the comparison ‘baseline vs. after weight reduction’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
A.79 The top 10 reporter metabolites of the comparison ‘baseline vs. after weight reduction’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
A.80 The top 10 reporter metabolites of the comparison ‘baseline vs. after weight reduction’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
A.81 The top 10 reporter metabolites of the comparison ‘baseline vs. after weight maintenance phase’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
A.82 The top 10 reporter metabolites of the comparison ‘baseline vs. after weight maintenance phase’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
A.83 The top 10 reporter metabolites of the comparison ‘baseline vs. after weight maintenance phase’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
A.84 The comparison of the top 10 reporter metabolites between ‘baseline vs. after weight reduction’ and ‘baseline vs. after weight maintenance phase’ based on the adipocyte model.
A.85 The comparison of the top 10 reporter metabolites between ‘baseline vs. after weight maintenance phase’ and ‘baseline vs. after weight reduction’ based on the adipocyte model.
A.86 The comparison of the top 10 reporter metabolites between ‘baseline vs. after weight reduction’ and ‘baseline vs. after weight maintenance phase’ based on the EHMN model.
A.87 The comparison of the top 10 reporter metabolites between ‘baseline vs. after weight maintenance phase’ and ‘baseline vs. after weight reduction’ based on the EHMN model.
A.88 The comparison of the top 10 reporter metabolites between ‘baseline vs. after weight reduction’ and ‘baseline vs. after weight maintenance phase’ based on the Recon 1 model.
A.89 The comparison of the top 10 reporter metabolites between ‘baseline vs. after weight maintenance phase’ and ‘baseline vs. after weight reduction’ based on the Recon 1 model.

Acknowledgement

I would like to express my gratitude to several people without whose support it would not have been possible to write this diploma thesis. First of all, I would like to thank Univ.-Prof. Dr. Zlatko Trajanoski, director of the Section of Bioinformatics of the Medical University Innsbruck, who gave me the opportunity to work on this interesting topic. Moreover, I want to express my gratitude to both Univ.-Prof. DI Dr. Zlatko Trajanoski and DI(FH) Dr. Stephan Pabinger, Section of Bioinformatics of the Medical University Innsbruck, for the excellent support and guidance during the last nine months. I also want to thank Univ.-Prof. Dr. habil. Matthias Dehmer, head of the Institute for Bioinformatics and Translational Research at the UMIT, for being my supervisor on the part of the UMIT. In addition, I want to thank all my friends for encouraging and motivating me. Finally, I want to say a special thank you to my parents, who made my studies at the UMIT possible.

Statutory declaration

I hereby declare that this diploma thesis has been written only by the undersigned and without any assistance from third parties. Furthermore, I confirm that no sources have been used in the preparation of this thesis other than those indicated in the thesis itself.

......................................................
Signature