Section of Bioinformatics
Mapping gene expression data onto genome-scale human
metabolic models
A thesis submitted in
Partial Fulfillment of the Requirements for the
Degree of
"Diplom Ingenieur in Biomedical Informatics"
at the
University for Health Sciences, Medical Informatics and Technology
by
Michaela Willi
Supervisors
Univ.-Prof. DI Dr. Zlatko Trajanoski
DI(FH) Dr. Stephan Pabinger
Univ.-Prof. Dr. habil. Matthias Dehmer
Hall in Tyrol, August 2012
Confirmation of the Supervisor
I hereby declare to have supervised the present thesis and consequently approve its submission with a
positive assessment.
..............................................................................
Date and signature of the supervisor
..............................................................................
Name of the supervisor in upper-case letters
Acceptance by the study management
date .......................................
by ........................................
Abstract
Many diseases, such as diabetes, are caused by malfunctions of human metabolism. Obesity is a growing
health problem worldwide and provokes many metabolic diseases. The reconstruction of human metabolic
models enables better research into the underlying functions of metabolism and better computational
applications for its analysis and visualization. So far, two different human metabolic networks, the
Edinburgh Human Metabolic Network (EHMN) and the Human Recon 1, as well as several tissue-specific
models have been published.
The objective of the thesis was to map gene expression data onto genome-scale metabolic models.
For this purpose, two different approaches were pursued: the creation of two tissue-specific
models and the detection of reporter metabolites. First, adipose and liver tissue datasets were
selected and preprocessed in R, and the GIMME algorithm (COBRA toolbox) was applied to create the
tissue-specific models. For the second approach, eight datasets from adipose tissue were chosen and
differential expression analysis was carried out using the limma package in R. The differentially
expressed genes of each dataset, in combination with each of the three genome-scale metabolic models
(adipocyte, EHMN, and Recon 1), were used as inputs for the reporter metabolites analysis.
The newly created tissue-specific models were compared with already published genome-scale metabolic
models of the adipocyte and liver. The top 10 differentially expressed genes were listed in tables
with their corresponding pathways. To compare the resulting reporter metabolites, the KEGG
COMPOUND and GLYCAN IDs were added manually to the Recon 1 and adipocyte models. Moreover,
the pathways of the top 10 ranked reporter metabolites and of the top 10 ranked differentially
expressed genes (by p-value) were compared.
Differences in the ranking of the reporter metabolites occurred in all comparisons. Nevertheless, good
matches were obtained as well. In addition, agreement between the pathways of the top 10 genes
and metabolites was observed. Depending on the kind of comparison, the different rankings have
several causes: (i) the differences between the gene expression data; (ii) the three different
genome-scale metabolic models; (iii) the incompleteness of the IDs of the Recon 1 and adipocyte models;
and (iv) the internal IDs of the EHMN model.
In conclusion, genome-scale metabolic models contain a lot of biological information; hence they are a
powerful tool to study human metabolism as well as metabolic diseases. Increasing attention is
paid to cell- and tissue-specific models in order to obtain more precise metabolic models of the key
human tissues and cells.
Summary
Many diseases, such as diabetes, are consequences of malfunctions of human metabolism. Overweight
currently represents a steadily growing health risk worldwide, with many metabolic secondary diseases.
To investigate the underlying mechanisms of metabolism more closely and to better analyse and visualize
them with computer-aided applications, metabolic networks of the human have been constructed. To date,
two models of the entire human metabolism, the Edinburgh Human Metabolic Network (EHMN) and Recon 1,
as well as several tissue- and cell-specific networks have been published.
The objective of the thesis is the mapping of gene expression data onto genome-scale metabolic
models. For this purpose, two different approaches were developed: the creation of two tissue- or
cell-specific networks and the analysis of important metabolites, so-called 'reporter metabolites'.
First, datasets on adipose and liver tissue were selected and preprocessed in R. Applying the GIMME
algorithm (COBRA toolbox) yielded the tissue- or cell-specific networks. For the second approach, eight
datasets on adipose tissue were selected. The subsequent gene expression analysis was carried out in R
using the limma package. The differentially expressed genes of each dataset were combined with each
of the three genome-scale metabolic models (EHMN, Recon 1, and the adipocyte model) and used as input
for the analysis of the important metabolites.
The created tissue- or cell-specific networks of the adipocyte and the liver were compared with already
published networks. The ten top-ranked differentially expressed genes (by p-value) were presented in
tables together with the corresponding metabolic pathways. To enable a comparison of the resulting
important metabolites, the KEGG COMPOUND and GLYCAN IDs were added to the Recon 1 and adipocyte
models. In addition, the metabolic pathways of the ten top-ranked genes and metabolites were compared.
Differences between the rankings of the metabolites were found in all comparisons. Nevertheless, very
similar rankings occurred as well. Furthermore, matches between the metabolic pathways of the genes
and the metabolites were observed. Depending on the kind of comparison, the different ranking of the
metabolites has various causes: (i) differences between the gene expression data; (ii) the use of three
different genome-scale metabolic models; (iii) the incompleteness of the IDs of the Recon 1 and
adipocyte models; and (iv) the internal IDs of the EHMN model.
In summary, genome-scale metabolic models contain much biological information and are therefore very
well suited to studying human metabolism as well as metabolic diseases. Tissue- and cell-specific
models are receiving increasing attention in order to obtain more precise information about the
important human tissues and cells.
Contents
1 Introduction
  1.1 Objectives
2 State of the art
  2.1 Metabolism
    2.1.1 Metabolic networks
    2.1.2 Metabolic pathways
  2.2 Network Visualization and Analysis
    2.2.1 Network approach
    2.2.2 Kinetic approach
    2.2.3 Stoichiometric approach
      Stoichiometric matrix
      Elementary flux mode (EFM) and extreme pathways
      Flux balance analysis
  2.3 Genome-Scale Metabolic Models and the Constraint-based approach
    2.3.1 Genome-Scale Metabolic Models
    2.3.2 Constraint-based approach
    2.3.3 Human metabolic models
      Cell- and tissue-specific models
      Systems Biology Markup Language
3 Methods
  3.1 Toolboxes
    3.1.1 COBRA toolbox
      GIMME algorithm
      Reporter metabolites algorithm
    3.1.2 TIGER toolbox
    3.1.3 OptFlux
    3.1.4 BioMet toolbox
      Reporter Features algorithm
  3.2 R and Bioconductor
    3.2.1 Preprocessing
      Background adjustment
      Normalization
      Summarization
    3.2.2 GEOquery
    3.2.3 Presence/Absence calls from Negative Probesets
    3.2.4 Linear Models for Microarray Data
  3.3 Databases
    3.3.1 ArrayExpress database
    3.3.2 BiGG database
    3.3.3 CheBI database
    3.3.4 GEO database
    3.3.5 Human Metabolome Database
    3.3.6 KEGG databases
4 Results
  4.1 Comparison of toolboxes
  4.2 Creating tissue-specific models
    4.2.1 Expression data and preprocessing in R
    4.2.2 Presence/Absence calls from Negative Probesets
    4.2.3 Final model generation using the GIMME algorithm
  4.3 Gene expression data analysis
    4.3.1 Obtaining expression data
    4.3.2 Calculation of differential expression
  4.4 Reporter metabolites analysis
    4.4.1 Reporter Features Analysis
    4.4.2 Adapting human Recon 1
    4.4.3 Comparison of the reporter metabolites
      Overlapping metabolites over all datasets
      Overlapping metabolites in each model
      Differential gene expression in adipose tissue from obese human subjects during weight loss and weight maintenance (GSE35411) - Adipocyte model
5 Discussion
List of Figures
List of Tables
Bibliography
A Results of all selected datasets
  A.1 Gene expression in adipose tissue during weight loss (GSE11975)
    A.1.1 Differentially expressed genes
      Before vs. after energy restriction (ER)
      After energy restriction vs. after weight stabilization (WS)
      Before dietary intervention vs. after weight stabilization (DI)
    A.1.2 Comparison between the models
      Before vs. after energy restriction (ER)
      After energy restriction vs. after weight stabilization (WS)
      Before dietary intervention vs. after weight stabilization (DI)
    A.1.3 Comparison between expression data
      Adipocyte
      EHMN
      Recon 1
  A.2 Expression data from human adipose tissue (GSE15773)
    A.2.1 Differentially expressed genes
      Insulin resistant vs. insulin sensitive omental tissue
      Insulin resistant vs. insulin sensitive subcutaneous tissue
    A.2.2 Comparison between the models
      Insulin resistant vs. insulin sensitive omental tissue
      Insulin resistant vs. insulin sensitive subcutaneous tissue
    A.2.3 Comparison of expression data
      Adipocyte
      EHMN
      Recon 1
  A.3 Genome-wide analysis of adipose tissue gene expression in twin-pairs discordant for physical activity for over 30 years (GSE20536)
    A.3.1 Differentially expressed genes
      Active vs. non-active
    A.3.2 Comparison between the models
      Active vs. non-active
  A.4 Differences in subcutaneous adipose tissue gene expression between obese African Americans and Hispanic Youths (GSE23506)
    A.4.1 Differentially expressed genes
      African Americans vs. Hispanics
    A.4.2 Comparison between the models
      African Americans vs. Hispanics
  A.5 Subcutaneous adipose tissue: comparison of weight maintenance and weight regain following an 8-week low calorie diet (GSE24432)
    A.5.1 Differentially expressed genes
      Weight maintenance - before low calorie diet vs. after low calorie diet
      Weight regainer - before low calorie diet vs. after low calorie diet
    A.5.2 Comparison between the models
      Weight maintenance - before low calorie diet vs. after low calorie diet
      Weight regainer - before low calorie diet vs. after low calorie diet
    A.5.3 Comparison of expression data
      Adipocyte
      EHMN
      Recon 1
  A.6 Characterization of the initial molecular events of adipose tissue development and growth during overfeeding in humans (GSE28005)
    A.6.1 Differentially expressed genes
      Day 0 vs. day 14
      Day 0 vs. day 56
    A.6.2 Comparison between the models
      Day 0 vs. day 14
      Day 0 vs. day 56
    A.6.3 Comparison of expression data
      Adipocyte
      EHMN
      Recon 1
  A.7 Hypoxia-induced modulation of gene expression in human adipocytes (GSE34007)
    A.7.1 Differentially expressed genes
      Normoxic vs. hypoxic conditions
    A.7.2 Comparison between the models
      Normoxic vs. hypoxic conditions
  A.8 Differential gene expression in adipose tissue from obese human subjects during weight loss and weight maintenance (GSE35411)
    A.8.1 Differentially expressed genes
      Baseline vs. after weight reduction
      Baseline vs. after weight maintenance phase
    A.8.2 Comparison between the models
      Baseline vs. after weight reduction
      Baseline vs. after weight maintenance phase
    A.8.3 Comparison of expression data
      Adipocyte
      EHMN
      Recon 1
Appendix - List of Tables
Acknowledgement
Statutory declaration
CHAPTER 1. INTRODUCTION
Chapter 1
Introduction
Metabolism is the core cellular function [1] and describes a set of reactions that includes the
degradation, synthesis, and interaction of macromolecules [2]. A large number of metabolic genes and
enzymes have been studied individually over a long time, resulting in the existing knowledge base,
called the bibliome. The bibliome describes the interactions and reactions of metabolic genes and
enzymes [3].
However, bibliomic data alone are not enough to analyse the molecular activity within a cell. One way
forward is to build models that represent the interactions among all components, and genome-scale in
silico models are a powerful method for this [4]. Building such a model involves, on the one hand, a
bottom-up reconstruction using genomic, experimental, and bibliomic data and, on the other hand,
iterative improvement [3][5][6]. The first genome-scale in silico metabolic model of a eukaryotic cell
was the reconstruction of Saccharomyces cerevisiae, which led to a better understanding of eukaryotic
cellular behaviour [7]. So far, two different human metabolic networks have been reconstructed: the
Edinburgh Human Metabolic Network (EHMN) by Ma et al. [8] and the Human Recon 1 by Duarte
et al. [3].
These advancements lead to new opportunities in disease research and drug development. Because
metabolism is affected by genetic, environmental, and nutritional factors, metabolic malfunctions
are a main contributor to human diseases [3][5]. A quick survey of the Online Mendelian Inheritance
in Man (OMIM) database shows that 23% of metabolic genes are disease related and that 48% of
metabolic reactions are influenced by those disease-related genes. Many diseases can be classified as
metabolic diseases, such as cardiovascular illnesses, cancer, and diabetes [9]. For instance, it is
known that cancer cells modify the human metabolism during tumor development [10] and that
there is an association between the central carbon metabolism and cancer development [1].
Genome-scale in silico models are also used to predict and develop new drug targets [1][10].
Research into the interactions between drugs and the metabolic system offers a new perspective for
discovering drug targets [9], making them more personalized and better adapted to a specific disease.
Consequently, better drug treatment will become possible.
In the future, genome-scale in silico models will be useful not only for modeling metabolic pathways;
they also represent a powerful tool for researching diseases, allowing illnesses to be diagnosed
earlier and, in consequence, treatment to start earlier [4]. Furthermore, they are relevant for
developing more efficient anticancer drugs [10] that provide better treatment for patients.
Because obesity is a growing health problem worldwide and, moreover, provokes metabolic diseases,
the focus of this thesis lay on adipose tissue datasets.
The number of obese people has doubled since 1980, and today obesity is ranked fifth among the leading
risks for global deaths. The prevalence of overweight and obesity is increasing worldwide, in high- and
middle-income countries as well as in low-income countries. Moreover, 65% of the world's population
lives in countries where overweight and obesity cause more deaths than underweight and
undernourishment [11].
The main reason for overweight and obesity is the combination of an increased energy intake and
a decrease in physical activity [11]. The Body-Mass-Index (BMI) is widely used to classify people as
overweight or obese and is calculated as the weight of a person (in kg) divided by the square of the
height (in meters). The resulting number classifies people as underweight (under 18.5), normal
(18.5 - 25), overweight (25 - 30), or obese (above 30) [12].
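The BMI calculation and classification described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the assignment of the exact boundary values (18.5, 25, and 30) to a class is an assumption, since the ranges given in the text do not state it.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body-Mass-Index: weight in kg divided by the square of the height in m."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def classify_bmi(value: float) -> str:
    """Map a BMI value to the four classes named in the text."""
    # Assumption: 18.5 counts as normal, 25 as overweight, and 30 as overweight
    # ("obese" is described as "above 30"). The text leaves these edges open.
    if value < 18.5:
        return "underweight"
    if value < 25:
        return "normal"
    if value <= 30:
        return "overweight"
    return "obese"


# Example: a person of 70 kg and 1.75 m
example_bmi = bmi(70, 1.75)
example_class = classify_bmi(example_bmi)
```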
Among other things, obesity is associated with several health consequences, such as the following noncommunicable diseases [11]:
- Cardiovascular disease
- Type 2 diabetes
- Musculoskeletal disorders
- Special kinds of cancer, such as breast, colon and endometrial cancer
- Insulin resistance
- Hypertension
At the cellular level, obesity causes modifications and dysfunctions in adipose tissue, including
macrophage infiltration, inflammation, and fibrosis [13]. Moreover, it introduces changes in the fatty
acid metabolism, leading to an increased fatty acid flux in adipose tissue and, furthermore, to
metabolic dysfunctions in liver and skeletal metabolism [14].
It is known that adipose tissue plays an important role in multiple human metabolic pathways. The
inflammatory pathways and the macrophage infiltration in adipose tissue are interconnected with obesity [15].
Due to the multitude of associated diseases, much scientific work remains to be done to obtain more
knowledge about obesity and the resulting health problems.
1.1 Objectives
The goal of the thesis is to map gene expression data onto genome-scale metabolic models in order to
explore pathophysiological modifications in adipose tissue.
Specific aims of the thesis are:
1. Construction of two human tissue-specific models for liver and adipocyte
2. Collection and preprocessing of gene expression data from human adipose tissue
3. Adaptation of human metabolic models
4. Illustration of the differences between models and datasets
CHAPTER 2. STATE OF THE ART
Chapter 2
State of the art
2.1 Metabolism
All cells of living organisms, whether prokaryotic or eukaryotic, possess their own metabolism.
Metabolism itself is a highly organized process and consists of many interconnected chemical
reactions, which are catalyzed by enzymes controlling the generation and decomposition of
macromolecules [2]. For instance, these chemical reactions govern the following processes: the
generation of membranes, replication and repair of DNA, development of new cells, transport processes,
and the transformation of one molecule into another [2][16][17]. These are only a few examples of
metabolic processes, but they show the importance of metabolism for vital functions [1][3][18].
In addition to the processes mentioned, metabolism is responsible for producing energy and generating
components for the organism [2], leading to the two main categories of chemical reactions: catabolism
and anabolism. Catabolism decomposes macromolecules and yields energy, while anabolism uses energy
to build new cellular components [18][19].
2.1.1 Metabolic networks
The sequences of related reactions can be illustrated in so-called metabolic networks, which connect
many pathways [20]. These maps are usually very complex and seem incomprehensible at first sight,
so it is important to focus on individual metabolic pathways to draw conclusions [18].
2.1.2 Metabolic pathways
A metabolic pathway is a stepwise process [18] and can generally be described as a chain of
reactions [2]. As mentioned before, there are two categories of reactions. This leads to two kinds of
metabolic pathways: catabolic and anabolic pathways [18][19]. Each pathway has one entry point and
may have several exit points [20].
Metabolic pathways are often studied in lower organisms, because these are less complex and easier
to characterize. Subsequently, one can compare these pathways with those of higher organisms to
draw conclusions about the similarity of functions [18][19].
2.2 Network Visualization and Analysis
The representation and analysis of metabolic networks can be divided into three main approaches:
the network approach, the stoichiometric approach, and the kinetic approach. Each method works at a
different level of detail and consequently depends on different input information [21].
All three approaches are shown in figure 2.1, where the triangles represent two main features:
the size of the system and the level of detail. The network approach is well suited to very large
systems but offers only a low level of detail, while the kinetic approach handles a high level of
detail but only a small part of the system. The third approach, the stoichiometric approach, is a
compromise between the two, with a medium system size and a medium level of input detail. As a
general rule, it can be stated that, in order to get precise information and increase the significance
of results, the size of the system needs to be reduced and more detailed information is needed. Using
the network approach, only qualitative predictions are possible, whereas the stoichiometric and
kinetic approaches allow quantitative predictions [21][22].
Figure 2.1: Illustration of the different approaches for visualizing and analysing metabolic networks
taken from [22].
2.2.1 Network approach
Networks are used to represent 'real-world' objects, and networks in systems biology mainly serve
these four aims [23]:
- Visualisation of complex biological structures
- Interpretation of the network as a model, based on the used mathematical approach
- Representation of the network as a data structure to extract biological information
- Promoting a more comprehensive knowledge of biological structures and processes by generating
new networks
Graphs are either directed or undirected. A simple network consists of nodes (vertices), representing
for example metabolites, and edges, representing the interactions between metabolites. Networks can
be assigned to the following network classes [23]:
- Regular networks: Each node in this network has the same degree, meaning for example that
each node is connected to three other nodes by three edges.
- Random networks: A random graph is characterized by a probability that describes the
likelihood of an edge between two nodes.
- Directed acyclic graphs: A directed acyclic graph is a special kind of directed graph that
does not contain directed cycles. For example, ontologies within bioinformatics or systems biology
are visualized using directed acyclic graphs.
- Trees: A tree is a connected acyclic graph, in which any two nodes are connected by exactly one path.
- Generalized Trees: Generalized trees are an extension of the network class trees. This
representation consists of nodes, edges, and levels, and each node is assigned to a level. In
comparison to trees, generalized trees allow edges between two nodes within one level and between
nodes where one or more levels are skipped. Consequently, there are more paths available to reach a node.
- Small-world networks: Small-world networks are characterized by a short average path length and a
high clustering coefficient. This implies that one node can be reached from another node by using
only a small number of edges.
- Scale-free networks: Scale-free networks are observed in 'real-world' networks. The degree
distribution follows a power law; such networks can arise when a newly added node is connected to
already existing nodes with a probability that depends on their degree.
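Some of the defining properties of these network classes can be checked directly on a small graph. The following sketch is an illustration only and is not taken from the thesis: it assumes an undirected graph stored as a dictionary that maps each node to the set of its neighbours, and all function names and toy graphs are hypothetical.

```python
from itertools import combinations


def degrees(adj):
    """Degree of every node in an undirected graph {node: set(neighbours)}."""
    return {node: len(nbrs) for node, nbrs in adj.items()}


def is_regular(adj):
    """A regular network: every node has the same degree."""
    return len(set(degrees(adj).values())) <= 1


def clustering_coefficient(adj, node):
    """Fraction of possible edges among the neighbours of `node` that exist."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


def is_tree(adj):
    """A tree: connected, and the number of edges equals nodes minus one."""
    if not adj:
        return False
    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    if n_edges != len(adj) - 1:
        return False
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:  # depth-first search to test connectivity
        node = stack.pop()
        for nbr in adj[node]:
            if nbr not in seen:
                seen.add(nbr)
                stack.append(nbr)
    return len(seen) == len(adj)


# Toy graphs (hypothetical): a ring of four nodes, a triangle, and a path
ring = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "a"}}
triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
```

In this sketch the ring is regular but not a tree, the path is a tree but not regular, and the triangle has the maximal clustering coefficient of 1.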
For systems biology, gene networks are interesting, because their nodes represent genes or gene products
and their edges molecular interactions. Metabolic networks are one type of gene network [23], and they
can be constructed as a bipartite graph with directed or undirected edges, as shown in figure 2.2.
Bipartite means that there are two kinds of vertices: one set represents the reactions
and the other the metabolites [21][22][24]. Because of the two sets of nodes,
such a graph can be separated into two subgraphs, one including the reactions and the other including
the metabolites [21][22]. The edges of this graph lead from a metabolite vertex to a reaction vertex
and vice versa. This means that a metabolite node, which represents a compound that
is to be catabolized, leads through an edge to a reaction node, which represents the enzyme, and
then to another metabolite node, which represents a product [24].
Figure 2.2: A bipartite graph. The circles represent the metabolite vertices
and the rectangles the reaction vertices. The figure is redrawn from [24].
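To make the bipartite structure concrete, the following Python sketch builds such a directed graph from a small reaction list. The two reactions, their identifiers and the dictionary layout are hypothetical and serve only as an illustration.

```python
# Hypothetical toy reactions: each reaction lists its substrates and products.
reactions = {
    "R1": {"substrates": ["glucose"], "products": ["g6p"]},
    "R2": {"substrates": ["g6p"], "products": ["f6p"]},
}


def build_bipartite_graph(reactions):
    """Return metabolite nodes, reaction nodes and directed edges between the two sets."""
    metabolite_nodes = set()
    reaction_nodes = set(reactions)
    edges = []
    for rxn, parts in reactions.items():
        for met in parts["substrates"]:
            metabolite_nodes.add(met)
            edges.append((met, rxn))  # substrate -> reaction
        for met in parts["products"]:
            metabolite_nodes.add(met)
            edges.append((rxn, met))  # reaction -> product
    return metabolite_nodes, reaction_nodes, edges


def is_bipartite_edge_set(metabolite_nodes, reaction_nodes, edges):
    """Check that every edge links one metabolite node and one reaction node."""
    return all(
        (a in metabolite_nodes and b in reaction_nodes)
        or (a in reaction_nodes and b in metabolite_nodes)
        for a, b in edges
    )
```

No edge ever connects two metabolites or two reactions directly, which is the property that distinguishes this representation from a plain metabolite graph.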
Different network descriptors for undirected or directed graphs can be applied to characterize the
structural properties of such a metabolic network [23]. For instance, connectivity, clustering coefficient
and various centrality measures are especially useful for undirected graphs. In contrast, stoichiometric
properties and chokepoints are descriptors for directed metabolic graphs [24].
Topological properties can be calculated and interpreted as shown in the following examples: Metabolic
networks have a notably short average path length, which indicates a small-world property. This
measure can be used to assess properties such as [21][22]:
- The time for spreading information within the network
- The damage, which is caused by deletions of enzymes
- The differences between two compared organisms
- The hierarchical structure of the metabolic networks
Examples show that the degree distribution of metabolic graphs is very heterogeneous, indicating a
scale-free network in which a few metabolites are strongly connected and take part in many reactions.
These metabolites are referred to as hubs. Furthermore, the degree distribution is an indicator of the
robustness and error tolerance of a metabolic network, for example towards the deletion of nodes or edges.
Centrality measures are another approach for identifying important components of a network
[21][23].
The network approach has two main advantages: it is practicable for large-scale metabolic models and
only topological information is needed. Kinetic parameters are not required, which allows the
analysis of less well-studied organisms. One drawback is that metabolic graphs are bipartite, which
requires a more sophisticated analysis of such networks. Moreover, it is not possible to analyse dynamic
properties of the network, and only qualitative descriptions can be calculated [22].
2.2.2 Kinetic approach
The kinetic approach, which is a quantitative evaluation of the properties of a metabolic network [22],
considers individual molecules and their interactions [2]. It is therefore limited to small
systems or pathways [22], although kinetic representation and analysis is the most detailed of the three
approaches mentioned [21].
The metabolic processes are described with the help of mass-balance equations or in terms of ordinary
differential equations [21][22]. The following figure 2.3 shows an example [22]:
Figure 2.3: Example of a minimal model of glycolysis to illustrate the kinetic approach [22]. A is the
reaction scheme, a graphical representation of a minimal model of glycolysis. It shows that one
unit of glucose (G) is converted by reactions into two units of pyruvate (P). B shows the stoichiometric
matrix N, whose rows correspond to the metabolites and whose columns correspond to
the reactions. Gx, Px, and Glx represent external metabolites, which are not in the
stoichiometric matrix. C represents the reaction list of the model and D the dynamic mass-balance
equations, a system of differential equations [21][22].
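The idea of dynamic mass-balance equations can be sketched in Python with a deliberately simplified two-metabolite system in which one unit of G yields two units of P. The rate laws (linear mass action), the rate constants and the constant uptake flux are assumptions chosen for this illustration and are not the model shown in figure 2.3.

```python
def simulate(k1=0.5, k2=0.2, v_in=1.0, dt=0.01, steps=20000):
    """Integrate dG/dt = v_in - v1 and dP/dt = 2*v1 - v2 with the explicit Euler method.

    Assumed rate laws (hypothetical): v1 = k1 * G for G -> 2 P, and v2 = k2 * P for the
    export of P.
    """
    G, P = 0.0, 0.0
    for _ in range(steps):
        v1 = k1 * G          # conversion of G into two units of P
        v2 = k2 * P          # removal of P from the system
        dG = v_in - v1       # mass balance of G
        dP = 2.0 * v1 - v2   # mass balance of P (stoichiometric coefficient 2)
        G += dt * dG
        P += dt * dP
    return G, P


def steady_state(k1=0.5, k2=0.2, v_in=1.0):
    """Analytical steady state obtained by setting both derivatives to zero."""
    return v_in / k1, 2.0 * v_in / k2
```

After a sufficiently long simulation the concentrations approach the analytical steady state. Even this small example needs three kinetic parameters, which illustrates why the approach scales poorly to large networks.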
As already mentioned, the advantage of this approach is that it returns very detailed results and
a quantitative prediction of the dynamic behaviour. However, it requires detailed information about
enzyme-kinetic rate functions and kinetic parameters. This information is often lacking, and even
where kinetic parameters exist, it is difficult to find reliable values
[21][22].
2.2.3 Stoichiometric approach
The stoichiometric approach, which is based on a mathematical representation of the network by
a stoichiometric matrix [2], takes structural properties as well as constraints into account [22].
It is independent of kinetic information, and the required stoichiometric
information about a network is often available [21][22]. Moreover, it is computable for large-scale
networks, it is more predictive than the network approach, and it returns quantitative
predictions, which are more suitable for analysing metabolic models. However, stoichiometric analysis
assumes a steady state and therefore does not allow conclusions about
dynamics. Consequently, predictions about allosteric regulation
are not possible either [22].
Stoichiometric matrix
The stoichiometric matrix contains information about the structure of the network [2] and is essential for
predicting network functions [22]. It describes the connections and interactions
between molecules [22], so that the kind and amount of molecules consumed and
produced by each reaction can be read from it [21].
The stoichiometric matrix is used to calculate the possible fluxes within a network at steady state.
It is an m × n matrix, where m is the number of metabolites (rows) and n the number of reactions
(columns) [2][22]. As the number of metabolites is usually smaller than the number of reactions, the
system of equations generally has no unique solution [21]. At steady state, the following linear
equation 2.1 holds [2][22]:
\[
\frac{dS(t)}{dt} = 0 \;\Rightarrow\; N\,v(S, k) = 0 \tag{2.1}
\]
m ... metabolites
n ... reactions
S ... m-dimensional time dependent vector of the metabolic concentrations
N ... m × n stoichiometric matrix
v ... n-dimensional vector of rate equations
k ... set of parameters
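The steady-state condition of equation 2.1 and the lack of a unique solution can be illustrated with a small Python sketch. The network (two metabolites, four reactions) and all names are hypothetical and chosen only for this example.

```python
def mat_vec(N, v):
    """Multiply the stoichiometric matrix N (list of rows) by the flux vector v."""
    return [sum(n_ij * v_j for n_ij, v_j in zip(row, v)) for row in N]


def is_steady_state(N, v, tol=1e-9):
    """Return True if N v = 0 holds within the numerical tolerance."""
    return all(abs(x) <= tol for x in mat_vec(N, v))


# Hypothetical toy network with metabolites A, B (rows) and four reactions (columns):
# r1: -> A,  r2: A -> B,  r3: A ->,  r4: B ->
N = [
    [1, -1, -1, 0],   # mass balance of A
    [0, 1, 0, -1],    # mass balance of B
]

flux_a = [3.0, 2.0, 1.0, 2.0]   # one steady-state flux distribution
flux_b = [3.0, 1.0, 2.0, 1.0]   # a different distribution with the same uptake
flux_c = [3.0, 2.0, 2.0, 2.0]   # violates the mass balance of A
```

Both `flux_a` and `flux_b` satisfy the steady-state condition although they distribute the uptake differently over the two branches. This is the non-uniqueness that arises when there are more reactions than metabolites.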
Elementary flux mode (EFM) and extreme pathways
In this approach the stoichiometry is used to find all possible routes from one metabolite to another
through a metabolic network [2]. All pathways that operate together under a steady-state
condition are represented as flux modes. Moreover, the set of EFMs is unique for a given network [22].
Elementary flux modes are minimal sets of flux vectors, meaning that they cannot be decomposed
further. An example is shown in figure 2.4. Existing software solutions allow the computation of
elementary flux modes, but because overlapping pathways lead to exhaustive enumeration,
the computation is limited to medium-sized metabolic models. Extreme pathways are very similar to
elementary flux modes. The difference is that extreme pathway analysis, in contrast to EFM analysis,
treats reversible internal reactions as two separate irreversible reactions
[2][22].
Figure 2.4: The figure depicts two similar reaction networks and the corresponding elementary flux
modes. It is taken from [2].
Flux balance analysis
Flux balance analysis (FBA) [25] is a widely used approach for analysing large-scale
metabolic networks [22]. It uses linear optimization to calculate the steady-state flux through the
network in order to predict the optimal performance of a specific organism or the optimal production
rate of a metabolite [25][26].
Flux balance analysis can be divided into five steps, as shown in figure 2.5 from Orth et al. [25].
The first step is the definition of the reactions, followed by the definition of the stoichiometric
coefficients of each reaction to build a stoichiometric matrix N [25][26]. In addition, constraints as
well as lower and upper bounds are defined.
The constraints set by FBA are defined as follows [2][27]:
1. The assumption of steady state: N v(S, k) = 0
2. The irreversibility of reactions: v ≥ 0
3. The limited capacity of enzymes to convert metabolites: v ≤ v_max
Figure 2.5: The figure illustrates the five steps of flux balance analysis [25].
In addition, FBA allows the definition of additional constraints belonging to one of four groups [26][28]:
1. Physico-chemical constraints, like reaction rate
2. Spatial or topological constraints, like growth of molecules inside the cell
3. Condition dependent environmental constraints, like temperature or nutrient availability
4. Regulatory constraints, like transcriptional and translational regulation or enzyme regulation
Another important aspect is the definition of upper and lower bounds for each reaction, as they define
the maximal and minimal allowable flux, respectively [25][26].
The third step is the setup of the system of linear equations, which may have more than one solution
because the number of reactions is higher than the number of metabolites [21].
The fourth step is setting the objective function and calculating those fluxes that maximize or
minimize it within the solution space [25]. This is a critical step, because the choice encodes the
goal of the study [26] by defining how much each reaction contributes to the result [25]. There
are different kinds of objective functions; the most common one is the maximization of
cell growth or biomass [26][27]. Other objective functions are, for example, minimization of ATP
production, minimization of nutrient uptake, maximization of metabolite production, maximization of
biomass and metabolite production, or optimal metabolite channeling [26]. The objective function is
defined in the following equation 2.2 [27]:
\[
Z = \sum_{i=1}^{r} c_i v_i \tag{2.2}
\]
Z ... objective function
c_i ... weights for each flux
v_i ... flux
The last step is to combine the mathematical representation of metabolic reactions and the objective
function to solve the system of linear equations by using linear programming [25][27].
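The five steps can be followed by hand on a small example. The network below is hypothetical and not taken from the cited sources: it has three metabolites A, B and C and five reactions, v1: → A (uptake), v2: A → B, v3: A → C, v4: B → (chosen as the objective) and v5: C →. The bounds are assumptions made for this illustration.

```latex
% Steps 1 and 2: reactions, stoichiometric matrix and steady-state condition
\[
N = \begin{pmatrix}
 1 & -1 & -1 &  0 &  0 \\
 0 &  1 &  0 & -1 &  0 \\
 0 &  0 &  1 &  0 & -1
\end{pmatrix},
\qquad
N v = 0 \;\Rightarrow\; v_1 = v_2 + v_3,\quad v_2 = v_4,\quad v_3 = v_5
\]
% Steps 3 and 4: bounds and objective function with c = (0, 0, 0, 1, 0)
\[
\max \; Z = v_4
\quad \text{subject to} \quad
N v = 0,\quad 0 \le v_1 \le 10,\quad 0 \le v_2 \le 6,\quad 0 \le v_i \le 100 \;\; (i = 3, 4, 5)
\]
% Step 5: solution of the linear program
\[
Z^{*} = 6,\qquad v_2 = v_4 = 6,\qquad v_3 = v_5 = t,\qquad v_1 = 6 + t,\qquad 0 \le t \le 4
\]
```

The optimal value of the objective is unique, but the flux distribution is not: any t between 0 and 4 gives the same Z. This reflects the observation above that the system has more reactions than metabolites.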
The advantages of this approach are that no kinetic parameters are needed and that the calculation
is fast even for large-scale metabolic models. The drawback is that, due to the lack of kinetic
parameters, the results are less detailed. Another disadvantage is that only steady-state
fluxes can be calculated, which limits the predictive power [22][25]. Additionally, four
situations cause problems. The first two are parallel metabolic routes, in which
two enzymes catalyze the same reaction, and reversible routes. In these cases the optimization
takes only the combined flux of both routes into account, and it is not possible to resolve the
individual fluxes. Cyclic fluxes pose a similar problem: they cannot be resolved either, since they have
no influence on the remaining fluxes of the network [22]. Futile cycles are the last problematic case [22].
They occur if two opposing reactions are catalyzed by different enzymes at the same time, so that
energy is dissipated [29]. Optimization criteria normally do not favour such cycles, because they are
not part of an optimal solution, so they do not appear in the results of flux
balance analysis, even though they are common in many organisms [22].
2.3 Genome-Scale Metabolic Models and the Constraint-based approach
The stoichiometric approach is currently the most suitable approach for analysing metabolic networks
[21][22], which encouraged the reconstruction of genome-scale metabolic models.
2.3.1 Genome-Scale Metabolic Models
The first genome-scale metabolic models of viruses were constructed in the early 1990s [30]. A
genome-scale metabolic model describes the relationships of a metabolic network on a genotype-phenotype level
[6]. Biological network models are mostly bottom-up reconstructions, that is, they are constructed
component by component from genomic and bibliomic data. This leads to a biochemically, genetically
and genomically structured reconstruction (BiGG) of a tissue-specific or non-tissue-specific metabolic
network [5][31].
Existing genome-scale models are available in all three domains of life: archaea (single-cell
microorganisms), bacteria (prokaryotic microorganisms) and eukarya. The most studied genome-scale models [32]
are those of Escherichia coli (bacteria) [33][34] and Saccharomyces cerevisiae (eukarya) [7].
The generation of a genome-scale metabolic model consists of four main steps [35][36]:
The first step includes collecting all the biological components, which are relevant for the reconstruction,
and the necessary information about them [32][35][36].
Next, the metabolic network is reconstructed by generating a metabolic reaction list that connects
the selected components and by constructing the gene-protein-reaction relationships, which define the
protein complex or protein that catalyzes a reaction [32][35].
The third step deals with the transformation of the reconstructed model into a mathematical
representation [32][35][36]. The resulting stoichiometric matrix N is an m × n matrix, where m is the number of
metabolites and n is the number of reactions [32]. Each column of the matrix corresponds to a reaction
and each row to a metabolite. Positive numbers represent products and negative numbers substrates
[35]. This matrix can be used for computational calculations.
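The transformation of a reaction list into a stoichiometric matrix can be sketched in Python as follows. The two reactions, their abbreviations and the dictionary layout are hypothetical; only the sign convention (negative for substrates, positive for products) and the row and column layout follow the description above.

```python
# Hypothetical reaction list: metabolite -> stoichiometric coefficient
# (negative = substrate, positive = product).
reactions = {
    "HEX1": {"glc": -1, "atp": -1, "g6p": 1, "adp": 1},
    "PGI": {"g6p": -1, "f6p": 1},
}


def build_stoichiometric_matrix(reactions):
    """Return (metabolites, reaction ids, S) with one row per metabolite, one column per reaction."""
    mets = sorted({met for stoich in reactions.values() for met in stoich})
    rxns = list(reactions)
    row_of = {met: i for i, met in enumerate(mets)}
    S = [[0.0] * len(rxns) for _ in mets]
    for j, rxn in enumerate(rxns):
        for met, coeff in reactions[rxn].items():
            S[row_of[met]][j] = float(coeff)
    return mets, rxns, S


mets, rxns, S = build_stoichiometric_matrix(reactions)
```

Most entries of such a matrix are zero, because each reaction involves only a few metabolites. Genome-scale implementations therefore usually store it in a sparse format, which this sketch omits for brevity.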
In the final step the network is evaluated. By comparing simulation results with published experimental
data [35][36], it is possible to find mistakes such as missing metabolic functions and wrong assignments of
reversibility [35]. Usually, the construction of such a network is an iterative process and needs several
iterations to reach the final result [7][32][35]. Therefore, finishing a model is very labor- and
time-intensive [35].
In general, such a genome-scale metabolic model represents a BiGG knowledgebase and a mathematical
(in silico) model that enables constraint-based analysis.
2.3.2 Constraint-based approach
Different mathematical approaches can be used to study genome-scale metabolic models. As
already mentioned, many of these approaches need a lot of detailed kinetic parameters; because this
information is often lacking, their applicability is limited [31][32][37]. Moreover, many of
these approaches aim to predict the network functionalities in detail.
In contrast, the constraint-based approach (CBM) is data-driven and uses a mathematical model such as
a genome-scale metabolic model. The goal of this approach is to find those network states that can be
achieved while excluding all others [32]. It predicts properties of the network in silico [4]
and assumes a steady state [37], so that the flow of metabolites through the network can be observed [38].
In this way it is possible to identify gaps, that is, reactions that cannot carry a flux [3][38].
Constraint-based modeling makes it possible to study tissue-specific metabolic behaviour [39] and
to identify phenotypes of microorganisms, such as growth rate, nutrient uptake, product
secretion and the outcome of gene deletions [35][39].
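One simple structural way to look for such gaps is to search for dead-end metabolites, which are either only produced or only consumed. The Python sketch below assumes that all reactions are irreversible and that exchange reactions are part of the reaction list. The toy network and the function names are hypothetical, and a full analysis would normally rely on flux-based methods instead of this single structural pass.

```python
# Hypothetical toy network: metabolite -> coefficient (negative = consumed, positive = produced).
reactions = {
    "EX_A": {"A": 1},            # uptake of A
    "R1": {"A": -1, "B": 1},
    "R2": {"B": -1, "C": 1},
    "R3": {"A": -1, "D": 1},     # D is never consumed
    "EX_C": {"C": -1},           # secretion of C
}


def find_dead_end_metabolites(reactions):
    """Return metabolites that are only produced or only consumed."""
    produced, consumed = set(), set()
    for stoich in reactions.values():
        for met, coeff in stoich.items():
            if coeff > 0:
                produced.add(met)
            elif coeff < 0:
                consumed.add(met)
    return produced ^ consumed   # in exactly one of the two sets


def find_blocked_reactions(reactions):
    """Return reactions that touch a dead-end metabolite and thus carry no steady-state flux."""
    dead_ends = find_dead_end_metabolites(reactions)
    return sorted(rxn for rxn, stoich in reactions.items() if dead_ends & set(stoich))
```

At steady state a metabolite that is produced but never consumed would accumulate, so every reaction producing it must have zero flux. Such reactions are candidates for gap filling.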
The advantage is that only physico-chemical and environmental constraints, such as mass, energy, charge,
reaction fluxes or thermodynamics [4][31], are used [4][39][40].
2.3.3 Human metabolic models
So far, two global human metabolic models have been published, both in 2007, and are widely used for
systematic studies: the Homo sapiens Recon 1 model [3] and the Edinburgh Human Metabolic Network
(EHMN) [8][41]. Both models consist of compartmentalized metabolites, although EHMN has included
compartments only since its second version, published in 2010 [41]. The following table 2.1 shows
properties of both models [3][41]:
Model     Reactions   Metabolites   Genes   Compartments
EHMN      6216        6522          2322    8
Recon1    3743        2766          1496    8
Table 2.1: Comparison of the human metabolic models.
The following figure 2.6 illustrates the four major applications of global human metabolic models [42]:
1. Gene expression data can be used for the reconstruction of cell- and tissue-specific models
2. Reconstruction of similar mammalian models
3. Interpretation of gene expression data by mapping them onto global human metabolic networks
4. Simulation of pathological and drug states
Figure 2.6: The four major applications of global human metabolic models [42].
Cell- and tissue-specific models
Human metabolism cannot be viewed only from a global perspective; it is also important
to generate cell- and tissue-specific models that represent metabolism by taking tissue-specific
information into account [42][43]. The development of three different algorithms for this purpose
has accelerated the reconstruction of new cell- and tissue-specific models [43]. For
instance, the following cell- and tissue-specific models have already been reconstructed:
- human liver [37][44]
- alveolar macrophage [43]
- kidney [45]
- adipocyte [48]
- brain [46]
- myocyte [48]
- erythrocyte [47]
- hepatocyte [48]
Systems Biology Markup Language
Systems Biology Markup Language (SBML) [49] is a format based on the Extensible Markup Language
(XML) for describing biological networks and processes, such as pathways. Each component of the model is
defined in a specific list, and the definition of each component is optional. The lists are defined
independently, but depending on the model complexity there may be dependencies among them.
SBML Level 3 Version 1 is the most recent release, where 'level' denotes the edition and 'version' small
updates within a specific release. SBML is a computer-readable language, hence software packages can
translate SBML models into internal models and vice versa [49].
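The list-based structure of SBML can be illustrated with a minimal document and a short Python reader based on the standard library. The model content (one compartment, two species, one reaction) is hypothetical, and the reader is only a sketch that extracts reaction stoichiometry. It is not a replacement for a full SBML library and performs no validation.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 3 Version 1 Core.
SBML_NS = {"sbml": "http://www.sbml.org/sbml/level3/version1/core"}

# Hypothetical minimal model with the lists of compartments, species and reactions.
SBML_TEXT = """<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy_model">
    <listOfCompartments>
      <compartment id="c" constant="true"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="glc" compartment="c" hasOnlySubstanceUnits="false"
               boundaryCondition="false" constant="false"/>
      <species id="g6p" compartment="c" hasOnlySubstanceUnits="false"
               boundaryCondition="false" constant="false"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="HEX1" reversible="false" fast="false">
        <listOfReactants>
          <speciesReference species="glc" stoichiometry="1" constant="true"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="g6p" stoichiometry="1" constant="true"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""


def read_species(sbml_text):
    """Return the species identifiers defined in the model."""
    root = ET.fromstring(sbml_text)
    return [s.get("id") for s in root.findall(".//sbml:species", SBML_NS)]


def read_reactions(sbml_text):
    """Return reaction id -> {species: coefficient}, negative for reactants."""
    root = ET.fromstring(sbml_text)
    result = {}
    for rxn in root.findall(".//sbml:reaction", SBML_NS):
        stoich = {}
        for ref in rxn.findall("sbml:listOfReactants/sbml:speciesReference", SBML_NS):
            stoich[ref.get("species")] = -float(ref.get("stoichiometry", "1"))
        for ref in rxn.findall("sbml:listOfProducts/sbml:speciesReference", SBML_NS):
            stoich[ref.get("species")] = float(ref.get("stoichiometry", "1"))
        result[rxn.get("id")] = stoich
    return result
```

The reactions refer to species only by identifier, which is an example of the dependencies between otherwise separate lists mentioned above.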
The following figure 2.7 of Gianchandani et al. [50] illustrates metabolic network reconstructions
and analysis as an iterative workflow and simultaneously points out the connections between the last
chapters.
Figure 2.7: The reconstruction of a metabolic reaction network is done with data from literature and
gene-protein-reaction (GPR) relationships from experimental data. This information is converted into
a stoichiometric matrix for the following simulation step, which is an iterative process. Thereby flux
balance analysis is used to calculate the steady-state fluxes through the network using constraints. The
results are analysed and validated with the help of different methods. Subsequently, these outcomes
become data in, for example, published literature or online databases, which might be used as new input for
metabolic network reconstructions [50].
CHAPTER 3. METHODS
Chapter 3
Methods
3.1 Toolboxes
3.1.1 COBRA toolbox
The constraint-based reconstruction and analysis toolbox (COBRA) is a MATLAB package for analysis, prediction, and simulation of phenotypes. It uses bottom-up constructed genome-scale metabolic
models, which are stored in Systems Biology Markup Language (SBML) file format and can be imported into MATLAB by converting them to a COBRA model. The MATLAB model consists of different
fields, which include amongst others:
- rxns: list of all reaction abbreviations
- mets: list of all metabolite abbreviations
- S: stoichiometric matrix
- rev: defines if reactions are reversible or not
- lb/ub: lower/upper bounds of reactions
- c: objective coefficients
- genes: list of all genes (optional)
- rxnGeneMat: reaction-gene matrix (optional)
- rxnNames: list of all reaction names (optional)
- metNames: list of all metabolite names (optional)
- metChEBIID/metKEGGID/metPubChemID/metInChIString: one list for each metabolite ID (optional)
The COBRA toolbox offers a wide range of different methods, and allows community members to
provide new add-ons. Figure 3.1 displays an overview of the COBRA toolbox, including seven categories
of COBRA methods and additional functionalities for reading and writing models, for testing the
toolbox, and for integrating different solver functionalities [5]:
Figure 3.1: Overview of the COBRA toolbox functionalities [5].
Most COBRA methods follow the constraint-based approach: they return a reduced set of solutions
rather than a unique one. This requires the use of different constraints and metabolic objectives to
calculate possible network states under a defined set of conditions [5].
GIMME algorithm
The Gene Inactivity Moderated by Metabolism and Expression (GIMME) [51] algorithm offers the
possibility to algorithmically create a tissue-specific model out of genome-scale metabolic models and
expression data [42][51].
The GIMME algorithm requires three different inputs as tab-separated text files:
- gene expression data
- a genome-scale reconstructed network
- one or more Required Metabolic Functionalities (RMF) defining the new model
The algorithm follows a two-step procedure. The first step is the execution of FBA to calculate the
maximal possible fluxes through all RMFs. In the second step, the RMFs are constrained to operate
at or above a minimum level, for instance a percentage of the maximum found by FBA. In addition, a
cutoff value set by the user classifies reactions as active or inactive. However, it can happen that
a reaction is classified as inactive even though it is necessary to achieve the RMFs. To prevent this problem,
the following linear optimization (formula 3.1) is used to find the most consistent set of reactions and
consequently to reactivate misclassified ones [51].
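\[
\begin{aligned}
\text{Minimize:} \quad & \sum_i c_i \cdot |v_i| \\
\text{Subject to:} \quad & S \cdot v = 0 \\
& a_i < v_i < b_i \\
\text{where} \quad & c_i =
\begin{cases}
x_{\text{cutoff}} - x_i & \text{where } x_{\text{cutoff}} > x_i \\
0 & \text{otherwise}
\end{cases}
\quad \text{for all } i
\end{aligned}
\tag{3.1}
\]
c_i ... constraint
v_i ... flux vector
S ... stoichiometric matrix
x_cutoff ... cutoff value
x_i ... normalized gene expression data mapped onto each reaction
a_i, b_i ... lower and upper bound of each reaction by taking into account the RMFs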
The result of the GIMME algorithm is a reduced network with a minimal inconsistency score (IS),
which describes the disagreement between expression data and the objective function. To enable a
more intuitive interpretation, the IS values are converted to a normalized consistency score (NCS),
which characterizes how well the gene expression data fit the objective function [51].
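The weights c_i of formula 3.1 and the resulting inconsistency score can be sketched in Python for a given flux vector. This sketch only evaluates the objective of formula 3.1. It does not solve the linear program, and the function names and numbers are hypothetical rather than taken from the COBRA toolbox.

```python
def gimme_weights(expression, cutoff):
    """Return c_i = x_cutoff - x_i if x_cutoff > x_i, and 0 otherwise."""
    return [cutoff - x if cutoff > x else 0.0 for x in expression]


def inconsistency_score(weights, fluxes):
    """Return the objective value of formula 3.1: sum of c_i * |v_i|."""
    return sum(c * abs(v) for c, v in zip(weights, fluxes))


# Hypothetical example: normalized expression mapped onto three reactions.
expression = [12.0, 5.0, 8.0]
cutoff = 8.0
fluxes = [1.0, -2.0, 0.5]

weights = gimme_weights(expression, cutoff)
score = inconsistency_score(weights, fluxes)
```

Only the second reaction lies below the cutoff, so only its flux is penalized. A flux through a well-expressed reaction does not contribute to the score, regardless of its size.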
Reporter metabolites algorithm
The reporter metabolites algorithm by Patil and Nielsen [52] identifies metabolites having an important
function in metabolic regulation and highly correlated subnetworks. For this purpose gene expression
data are mapped onto genome-scale metabolic models to determine the so-called reporter metabolites.
The following figure 3.2 shows the step-wise procedure of the reporter metabolite algorithm:
Figure 3.2: Illustration of the step-wise procedure of the reporter metabolites algorithm [52].
The starting point is a genome-scale metabolic model, from which two new networks are derived: a
metabolic network and an enzyme interaction network. The metabolic network is a bipartite undirected
graph with metabolites and enzymes represented as nodes and their interactions as edges.
Each metabolite is involved in one or more reactions and is consequently connected to all enzymes
catalyzing these reactions.
The enzyme interaction network is a unipartite graph in which enzymes are nodes and metabolites are
edges, meaning that enzymes are connected if they share a metabolite in a reaction.
The next step is mapping transcriptional data onto the enzyme nodes of both graphs. Two kinds of
transcriptional data can be used: differential data (e.g. the comparison of two different conditions) and
multidimensional data (e.g. the comparison of multiple conditions). Differential data are mapped onto
the enzyme nodes using Student's t-test to calculate p-values, where each p-value represents
the significance of the change of an enzyme. For multidimensional data the absolute Pearson correlation
coefficient P is calculated for each edge between nodes. Both p-values and P-values follow a uniform
distribution and are therefore converted to Z scores by the inverse normal cumulative distribution. The
resulting scores are called the normalized transcriptional response [52].
Reporter metabolites are finally identified by scoring each metabolite by the normalized transcriptional
response of its neighbour enzymes as illustrated in formula 3.2 [52]:
\[
Z_{\text{metabolite}} = \frac{1}{\sqrt{k}} \sum Z_{n_i/e_j} \tag{3.2}
\]
Afterwards Zmetabolite scores are corrected for the background distribution. This scoring system defines
those metabolites with the highest score as reporter metabolites [52].
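The scoring of formula 3.2 can be sketched in Python with the standard library. Two details are assumptions of this sketch and are not stated in the text above: p-values are converted with Z = Φ⁻¹(1 − p), so that small p-values give large Z scores, and the background distribution is estimated by repeatedly sampling random enzyme sets of the same size k. All function names are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, pstdev


def p_to_z(p):
    """Convert a p-value (0 < p < 1) to a Z score, assuming Z = inverse_cdf(1 - p)."""
    return NormalDist().inv_cdf(1.0 - p)


def metabolite_score(neighbour_z):
    """Formula 3.2: sum of the k neighbour Z scores divided by sqrt(k)."""
    return sum(neighbour_z) / sqrt(len(neighbour_z))


def corrected_score(neighbour_z, all_z, samples=10000, seed=0):
    """Correct the score with the mean and standard deviation of random sets of size k."""
    rng = random.Random(seed)
    k = len(neighbour_z)
    background = [metabolite_score(rng.sample(all_z, k)) for _ in range(samples)]
    return (metabolite_score(neighbour_z) - mean(background)) / pstdev(background)
```

Dividing by the square root of k keeps the variance of the aggregated score independent of the number of neighbours under the null hypothesis. The background correction then accounts for the actual distribution of Z scores in the data set.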
The last step is the identification of highly correlated subnetworks within an enzyme interaction
network. However, this is an NP-hard (nondeterministic polynomial-time hard) problem, called the clique
problem, which describes the problem of finding a specific subgraph [52][53]. The reporter metabolites
algorithm uses simulated annealing as a heuristic approach to find a solution. However, two difficulties
remain: (i) simulated annealing may return not only globally optimal solutions, but also local ones;
(ii) the resulting subnetwork depends on the initial conditions and parameters. To overcome these
problems, the simulated annealing algorithm is repeated ten times and the subnetwork with the highest
score is selected [52].
3.1.2 TIGER toolbox
The Toolbox for Integrating Genome-scale Metabolism, Expression and Regulation (TIGER) [54] is a
MATLAB package that addresses three deficiencies of existing toolboxes, such as the COBRA
toolbox [5], CellNetAnalyzer [55], and the BioMet toolbox [56]:
- Converting COBRA models and transcriptional regulatory networks (TRNs) into integrated optimization problems
- Integrating high-throughput expression data to analyse these integrated models by using existing algorithms
- Offering users the possibility to develop new algorithms based on these integrated models
The TIGER toolbox is compatible with COBRA models (figure 3.3):
Figure 3.3: The figure illustrates the conversion of a COBRA model into a TIGER model [54].
The conversion process allows adding boolean constraints, which are derived from gene-protein-reaction
(GPR) relationships. These GPRs are defined in boolean logic and describe the relationships between genes,
their protein products and reactions. These boolean rules are converted step by step into systems
of inequalities; then upper and lower bounds are added, and the result is a mixed integer linear program
(MILP). The last step of converting a COBRA model into a TIGER model is mapping the converted
rules onto the COBRA model [54].
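How a boolean rule can be expressed as a system of inequalities over binary variables can be sketched in Python. The inequalities below are a common textbook linearization of AND and OR and are shown as an assumption. They are not necessarily the exact formulation used by the TIGER toolbox, and the function names are hypothetical.

```python
from itertools import product


def and_constraints(x, y, z):
    """Inequalities that force z = x AND y for binary x, y, z."""
    return z <= x and z <= y and z >= x + y - 1


def or_constraints(x, y, z):
    """Inequalities that force z = x OR y for binary x, y, z."""
    return z >= x and z >= y and z <= x + y


def linearization_is_exact():
    """Check all binary assignments against the boolean truth tables."""
    for x, y, z in product((0, 1), repeat=3):
        if and_constraints(x, y, z) != (z == (x and y)):
            return False
        if or_constraints(x, y, z) != (z == (x or y)):
            return False
    return True
```

For a GPR rule such as "gene1 AND gene2", z would indicate whether the enzyme complex is available. A solver for mixed integer linear programs can then handle the rule together with the flux constraints.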
This TIGER model can be used for developing new algorithms, applying functionalities such as flux
balance analysis, and creating context-specific networks using the GIMME [51], iMat [39] or MADE (Metabolic
Adjustment by Differential Expression) algorithms [57].
3.1.3 OptFlux
OptFlux is an open-source software platform with the aim of providing a user-friendly computational tool
for metabolic engineering applications. Metabolic engineering means optimizing the processes within
an organism to increase the production of a certain compound. In contrast to the COBRA and
TIGER toolboxes, OptFlux is a Java-based modular program and provides a Graphical User Interface
(GUI) to enable a user-friendly environment even for users with little knowledge of the research area
[58].
OptFlux provides a series of functionalities, which can be classified into four categories:
1. Model Handling: allows users to read models from flat text files, from text files that follow
the Metatool format [59], or from files in the SBML standard.
2. Simulation module: offers methods for metabolic phenotype simulations. It includes different
methods such as FBA [25], Minimization of Metabolic Adjustment (MOMA) [60], Regulatory
on/off minimization of metabolic flux changes (ROOM) [61] and Metabolic Flux Analysis (MFA).
3. Optimization: the aim of these methods is the optimization of the objective function by
identifying sets of reactions or genes that have to be deleted to reach the optimum. Implemented
methods for the optimization are OptKnock [62] and OptGene [63].
4. Pathway Analysis: provides the EFMTool [64] for elementary flux mode analysis and the
possibility to export a flux to CellDesigner [65].
3.1.4 BioMet toolbox
BioMet toolbox [56] is a web-based toolbox offering three analysis tools:
- Reporter Features
- Reporter Subnetworks
- BioOpt
The purpose of the Reporter Features algorithm [66] is the identification of transcriptional regulatory
circuits in a metabolic network. It is similar to the reporter metabolites algorithm of Patil and Nielsen [52]
explained in chapter 3.1.1. Reporter Subnetworks is derived from the reporter metabolites algorithm
and identifies significant subnetworks, as described in detail in chapter 3.1.1. Both tools, Reporter
Features and Reporter Subnetworks, use high-throughput data and genome-scale metabolic models for
predicting metabolic behaviours. The third tool, BioOpt, is a tool for conducting flux balance analysis
[56][66].
Reporter Features algorithm
The Reporter Features algorithm is a hypothesis-driven algorithm that maps gene expression data
onto genome-scale metabolic networks to identify groups of neighbouring genes that are significantly
co-regulated in comparison to the others. The algorithm is a generalization and extension of the reporter
metabolites algorithm of Patil and Nielsen [66].
The algorithm (figure 3.4) needs three kinds of input data as tab-separated text files: gene expression
data, an interaction or annotation list, and a genome-scale metabolic network.
Figure 3.4: Illustration of the step-wise procedure of the Reporter Features algorithm [52].
The interaction or annotation list may contain Protein-DNA interactions, Protein-Protein interactions,
or GO annotations and can be represented as a bipartite graph. The algorithm considers genes and
metabolites as nodes, and their interactions are represented by edges. Each metabolite is involved in
one or more reactions that are catalysed by enzymes [56][66].
Gene expression data contain a list of genes with their p-values from pairwise comparisons or their
Pearson correlation coefficients P in the case of multidimensional data. In both cases the inverse normal
cumulative distribution is used to convert the values into Z scores, which follow a standard normal
distribution [66].
The scoring system for scoring and ranking the features is based on the distribution of means of random
groups of the same size and is a test of the null hypothesis. The score of a metabolite depends
on the scores of its neighbours: their Z values are summed and divided by the number of
neighbours (formula 3.3), in contrast to reporter metabolites (formula 3.2), where the sum
is divided by the square root of the number of neighbours [52][66].
\[
Z_{\text{feature } j} = \frac{1}{N} \sum_{k=1}^{N} Z_{n_i/e_{jk}} \tag{3.3}
\]
The resulting Z score is corrected by subtracting the mean and dividing by the standard deviation.
These Z scores are converted back to p-values by the normal cumulative distribution, because the user
chooses the significance threshold on the p-value that defines metabolites as reporter metabolites [52][66].
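The mean-based score of formula 3.3 and the conversion back to a p-value can be sketched in Python. The sketch assumes that Z scores were obtained from p-values as Z = Φ⁻¹(1 − p), so that the back-conversion is p = 1 − Φ(Z). The function names and the way the background mean and standard deviation are passed in are hypothetical.

```python
from statistics import NormalDist


def feature_score(neighbour_z):
    """Formula 3.3: arithmetic mean of the Z scores of the N neighbours."""
    return sum(neighbour_z) / len(neighbour_z)


def corrected_p_value(score, background_mean, background_sd):
    """Standardize the score against the background and convert it back to a p-value."""
    z = (score - background_mean) / background_sd
    return 1.0 - NormalDist().cdf(z)


def is_reporter(p_value, threshold=0.05):
    """Apply the user-chosen significance threshold (0.05 is only an example value)."""
    return p_value < threshold
```

Because the mean of N values has a smaller variance than a single value, the correction against random groups of the same size is what makes scores of features with different numbers of neighbours comparable.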
The scoring system illustrated above is for first-degree Reporters. However, it is also possible to
choose the option of higher-degree Reporters.
The result of the Reporter Features algorithm offers the possibility to look at up- and down-regulated
features together, or at only up-regulated or only down-regulated Reporter Features [66].
3.2 R and Bioconductor
R and the package Bioconductor offer data structures and functions for importing and processing
microarray data. One-colour microarrays include one set of probe levels per microarray, whereas
two-colour microarrays produce two sets of probe levels (red and green) on each microarray. Microarray
data can be imported, amongst others, as CEL files or in the simple omnibus file format (SOFT). The use
of CEL files requires preprocessing steps before the data can be used for further analysis. In the following
section three R packages are described in more detail [67].
Affymetrix GeneChip arrays (one-colour microarrays) are used for high-throughput gene expression
analysis [68]. An Affymetrix GeneChip contains short oligonucleotides with a length of 25 bp per gene.
Because of the small size, multiple oligonucleotide probes, usually between eleven and twenty probe
pairs building one probeset for each gene, are used to increase specificity. Each probe pair contains one
perfect match (PM) strand, for specific hybridization, and one mismatch (MM) strand, for non-specific
hybridization. The MM strand is constructed by exchanging the thirteenth nucleobase of the PM strand
with the complementary nucleobase [67].
3.2.1 Preprocessing
Preprocessing consists of three steps: background adjustment, normalization, and summarization. A
wide range of methods is available for each step; three commonly used methods are described below.
Background adjustment
Background adjustment is an essential step, as it has the largest influence on accuracy and precision
[69]. The aim is to improve the measured array intensities by correcting the intensity readings for
non-specific signal [70]. The default adjustment, provided as part of the Affymetrix system, can be
described as the difference between PM and MM probe intensities [71][72].
MAS5 (Affymetrix Microarray Suite 5.0) is the default algorithm of Affymetrix and uses PM and MM probes
[73]. It is based on the robust average of log(PM − MM*) values, where MM* denotes MM values that have
been corrected so that PM − MM* does not become less than or equal to 0 [71]. For background
adjustment the chip is divided into a grid of sixteen rectangular regions, and the lowest 2% of the
probe intensities of each region are used to calculate the background value of that region. Afterwards
each probe intensity is adjusted by a weighted average of the regional background values. The weights
depend on the Euclidean distance between the probe and the centroid of each grid region [67].
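The zone-based background estimate can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions, not the Affymetrix implementation: the chip is represented as a list of rows, the regional background is taken as the mean of the lowest 2% of intensities, and the weight 1 / (d² + smooth) with smooth = 100 is an assumed form of the distance-dependent weighting.

```python
def zone_backgrounds(intensity, n_rows=4, n_cols=4, fraction=0.02):
    """Return (centroid_row, centroid_col, background) for each grid zone.

    intensity is a list of rows; the chip is split into n_rows x n_cols zones
    (sixteen by default) and the lowest `fraction` of each zone is averaged.
    """
    height, width = len(intensity), len(intensity[0])
    zones = []
    for zr in range(n_rows):
        for zc in range(n_cols):
            r0, r1 = zr * height // n_rows, (zr + 1) * height // n_rows
            c0, c1 = zc * width // n_cols, (zc + 1) * width // n_cols
            values = sorted(intensity[r][c] for r in range(r0, r1) for c in range(c0, c1))
            n_low = max(1, int(len(values) * fraction))
            background = sum(values[:n_low]) / n_low
            zones.append(((r0 + r1 - 1) / 2, (c0 + c1 - 1) / 2, background))
    return zones


def local_background(row, col, zones, smooth=100.0):
    """Weighted average of the zone backgrounds; closer zones get larger weights."""
    weights = [1.0 / ((row - zr) ** 2 + (col - zc) ** 2 + smooth) for zr, zc, _ in zones]
    total = sum(w * bg for w, (_, _, bg) in zip(weights, zones))
    return total / sum(weights)
```

Because every probe receives a blend of all sixteen regional values, the estimated background changes smoothly across the chip instead of jumping at the zone borders.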
The RMA (Robust Multiarray Analysis) approach includes all three steps of preprocessing (background
correction, quantile normalization, and summarization) and uses only PM probes [67][72]. RMA applies a
global background adjustment, meaning that the PM values are corrected probe cell by probe cell of the
microarray using a global model for the intensity distribution [67]. This is done by fitting a
Normal-Exponential mixture model and subtracting a background estimate from the PM value of each probe,
which guarantees positive results. Afterwards, the values are log transformed [70].
Irizarry et al. [73] demonstrated that RMA outperforms MAS5.
The GCRMA (Guanine-Cytosine Robust Multiarray Analysis) approach is similar to RMA: it also
includes all preprocessing steps and uses the same approach for normalization and summarization.
The GCRMA method is based on the additive background-multiplicative-measurement error (ABME)
model for reading intensities from microarray scanners. The difference between RMA and GCRMA
is that GCRMA uses sequence information to describe the non-specific binding component [67].
Guanine and cytosine hybridize more strongly than adenine and thymine,
because guanine and cytosine form three hydrogen bonds and adenine and thymine only two [71].
Naef and Magnasco [74] developed a solution for predicting specific hybridization effects by modelling
probe affinities as a sum of position-dependent base effects. It is reported that GCRMA outperforms
RMA and MAS5 [71][72].
Normalization
The normalization step is necessary to compare measurements from different arrays, because many
sources cause variation. In RMA and GCRMA quantile normalization is used to give each array the same
empirical distribution of intensities. The idea can be visualized with a quantile-quantile plot: two
data vectors with the same distribution produce a straight diagonal line with slope 1 and intercept 0.
To give two datasets the same distribution, the quantiles of the two data vectors are plotted against
each other and each data point is then projected onto the 45-degree line [67].
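Quantile normalization for several arrays can be sketched in Python as follows. This is a minimal sketch assuming arrays of equal length without missing values; the function name is chosen for this example, and ties are simply resolved by position rather than by averaging.

```python
def quantile_normalize(arrays):
    """Give every array the same empirical distribution.

    arrays is a list of equal-length lists, one list of intensities per array.
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(values) for values in arrays]
    # Reference distribution: mean of the k-th smallest value over all arrays.
    reference = [sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n)]
    normalized = []
    for values in arrays:
        order = sorted(range(n), key=lambda i: values[i])
        result = [0.0] * n
        for rank, index in enumerate(order):
            result[index] = reference[rank]  # replace each value by its quantile mean
        normalized.append(result)
    return normalized
```

After normalization each array contains exactly the values of the reference distribution, only in a different order, so a quantile-quantile plot of any two arrays lies on the diagonal.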
Summarization
The last step is summarization, which combines the multiple probe intensities of each probeset into a
single expression value [67].
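As a simplified illustration of summarization, the following Python sketch condenses the probes of each probeset into one value by taking the median of the log2 intensities. This is a stand-in chosen for brevity; the summarization methods used by MAS5, RMA, and GCRMA are more robust and differ in detail. The probe and probeset names in the usage note are hypothetical.

```python
import math
from statistics import median


def summarize(probe_intensities, probeset_of):
    """Combine probe intensities into one expression value per probeset.

    probe_intensities maps probe name -> background-adjusted, normalized intensity.
    probeset_of maps probe name -> probeset name.
    """
    grouped = {}
    for probe, value in probe_intensities.items():
        grouped.setdefault(probeset_of[probe], []).append(math.log2(value))
    # One expression value per probeset: median of the log2 probe intensities.
    return {probeset: median(values) for probeset, values in grouped.items()}
```

For example, three probes with intensities 4, 8, and 16 that belong to the same probeset are summarized to an expression value of 3.0 on the log2 scale.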
3.2.2 GEOquery
The GEOquery package provides easy access to files in SOFT format and makes it possible to handle the
information they contain. It thereby supports the use of publicly available high-throughput data with
Bioconductor analysis tools [75].
3.2.3 Presence/Absence calls from Negative Probesets
The R package 'Presence/Absence calls from Negative Probesets', panp, generates gene
expression values together with presence and absence calls. The number of present or absent probesets
can serve as a first measure of chip or sample quality. This first filtering step in
the analysis of differentially expressed (DE) genes is only possible with two methods: the MAS5
presence-absence method and the panp method. The MAS5 presence-absence method can only be used with
PM and MM probes, which implies that the MAS5 preprocessing method has to be used as well [68].
However, other preprocessing methods, such as RMA or GCRMA, have been developed, and it has been shown
that MM probes may have a negative impact on the result. The panp method was
developed to overcome this problem, because it handles data based on PM probes only as well as data
based on PM and MM probes [67].
Affymetrix GeneChip probesets are designed on the basis of short sequences, known as expressed
sequence tags (ESTs), which are available in a public database. Some of these ESTs were annotated with
the wrong strand direction, so that the resulting probesets are reverse complements. These probesets are
called 'Negative Strand Matching Probesets' (NSMPs) and are used as negative controls in the panp method
for the detection of the presence and absence calls [68].
To apply the panp method, the data of a chip have to be preprocessed with a method such as RMA,
GCRMA, or MAS5. The subsequent decision making of the panp method is illustrated in figure 3.5 as an
expression density plot. The probability distribution of the signal intensities of the NSMPs
is calculated and used to create a cumulative distribution function, which is converted to
a survivor distribution in order to derive a cutoff intensity at a given p-value. The horizontal lines
on the y-axis show two p-value cutoffs chosen by the user. The corresponding vertical lines mark
the intensities obtained by interpolating these p-value cutoffs, which classify genes as present,
marginal, or absent. Genes with an intensity below the leftmost cutoff line are absent, genes with an
intensity above the rightmost line are present, and genes with an intensity between the two cutoff
lines are marginal. As usual, the lower the p-value, the higher the significance [68].
Figure 3.5: The expression density for classifying genes as present, absent or marginal [68].
The panp method is available in R as a Bioconductor package. The R function
pa.calls requires an ExpressionSet as input object, one loose cutoff (default 0.02), and one tight
cutoff (default 0.01). The function returns two matrices, one containing the p-values and one
containing the indicators for present (P), marginal (M), and absent (A) [68].
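The decision rule can be sketched in Python as follows. This is an illustration of the idea rather than the pa.calls implementation: it uses the empirical survivor function of the NSMP intensities directly instead of an interpolated cutoff intensity, and the handling of values exactly at a cutoff is an assumption of this sketch.

```python
from bisect import bisect_left


def survivor_p_value(intensity, sorted_nsmp):
    """Fraction of negative-control (NSMP) intensities that are >= intensity."""
    n_below = bisect_left(sorted_nsmp, intensity)
    return (len(sorted_nsmp) - n_below) / len(sorted_nsmp)


def presence_call(intensity, sorted_nsmp, tight=0.01, loose=0.02):
    """Classify one probeset as present (P), marginal (M) or absent (A)."""
    p_value = survivor_p_value(intensity, sorted_nsmp)
    if p_value <= tight:
        return "P"
    if p_value <= loose:
        return "M"
    return "A"
```

With 100 negative controls of intensity 1 to 100, a probeset with intensity 150 is called present, one with intensity 98.5 marginal, and one with intensity 50 absent. If the tight and the loose cutoff are set to the same value, no marginal calls remain.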
3.2.4 Linear Models for Microarray Data
Linear Models for Microarray Data (limma) is a package for the analysis of differentially expressed
genes in data from microarray experiments. A linear model is fitted to the expression data of each
gene, which makes it possible to analyse simple and more complex experiments in a uniform manner.
Expression data can be log-ratios or log-intensities from one- or two-colour arrays [67][70].
A limma analysis starts with an already created ExpressionSet (eset) and needs two kinds of matrices:
a design matrix and a contrast matrix. The design matrix represents the different RNA targets
hybridized to the arrays, and the contrast matrix combines the coefficients of the design matrix to
enable comparisons between the RNA targets of interest. The rows of the design matrix represent the
arrays in the experiment and the columns the coefficients. For simple comparisons it is not necessary
to create a contrast matrix. The design matrix may be created manually or by using the command
model.matrix [67][70].
Next, a linear model is fitted to the data by using lmFit. If a contrast matrix is used, it is combined
with the fitted model to obtain estimates for the contrasts of interest. The next step applies the
empirical Bayes method, eBayes, to borrow information across the genes. Finally, the differentially
expressed genes can be listed by using the topTable method [67][70].
Many different designs are available, from which the appropriate one has to be chosen and adapted to
the given microarray data. Three of those designs are explained in more detail: the comparison of two
groups against a common reference, the comparison of two groups on single-channel microarrays, and the
comparison of paired samples [67][76].
The comparison of two groups against a common reference implies that a two-colour microarray is used,
where one channel contains the common reference and the other channel the two different groups to
compare, as can be seen in table 3.1 adapted from [67][76].
FileName   Cy3   Cy5
File 1     Ref   WT
File 2     Ref   WT
File 3     Ref   Mu
File 4     Ref   Mu
File 5     Ref   Mu
Table 3.1: Target file of the comparison of two groups against a common reference. The table is adapted
from [67].
The design matrix contains two columns and there are two possibilities to create it:
1. The first column includes the difference of wild-type and reference and the second column the
difference between mutant and wild-type. In this case the contrast matrix is not necessary, because
the comparison is already included in the design matrix.
2. The second approach handles the coefficients separately and so the first column includes the comparison between mutant and reference and the second column includes the comparison between
wild-type and reference. Because of the missing comparison between mutant and wild-type in the
design matrix a contrast matrix has to be created.
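The equivalence of the two parametrizations can be checked with a small Python sketch for the arrays of table 3.1. It is a plain least-squares fit for one gene, not the limma implementation; the log-ratios and the function name are invented for this illustration.

```python
def fit_two_coefficients(design, y):
    """Ordinary least squares for a design matrix with exactly two columns."""
    a = sum(row[0] * row[0] for row in design)
    b = sum(row[0] * row[1] for row in design)
    d = sum(row[1] * row[1] for row in design)
    u = sum(row[0] * value for row, value in zip(design, y))
    w = sum(row[1] * value for row, value in zip(design, y))
    det = a * d - b * b
    return ((d * u - b * w) / det, (a * w - b * u) / det)


# Hypothetical log-ratios (Cy5 vs. reference) of one gene: WT, WT, Mu, Mu, Mu.
log_ratios = [1.0, 1.2, 2.0, 2.2, 2.4]

# Possibility 1: columns are (WT vs. Ref, Mu vs. WT); no contrast matrix needed.
design_1 = [[1, 0], [1, 0], [1, 1], [1, 1], [1, 1]]
wt_vs_ref, mu_vs_wt = fit_two_coefficients(design_1, log_ratios)

# Possibility 2: columns are (Mu vs. Ref, WT vs. Ref); contrast (1, -1) gives Mu vs. WT.
design_2 = [[0, 1], [0, 1], [1, 0], [1, 0], [1, 0]]
mu_vs_ref, wt_vs_ref_2 = fit_two_coefficients(design_2, log_ratios)
contrast = (1, -1)
mu_vs_wt_2 = contrast[0] * mu_vs_ref + contrast[1] * wt_vs_ref_2
```

Both routes yield the same estimate for the difference between mutant and wild-type; the second merely postpones the comparison to the contrast step.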
The comparison of two groups on single-channel microarrays is done in the same way as the comparison
of two groups against a common reference. The differences in the target file can be seen in
table 3.2 [67][76]:
FileName   Target
File 1     WT
File 2     WT
File 3     Mu
File 4     Mu
File 5     Mu
Table 3.2: Target file of the comparison of two groups. The table is adapted from [67].
The third design is the analysis of paired samples. This kind of comparison is used to compare two
kinds of treatments, which means that the two members of a pair are compared directly. For instance,
one person receives treatment A and the other person receives treatment B, or one person is treated
and the other person is the control [67][76]. Afterwards a moderated t-test is computed in which the
pairing is taken into account.
The target frame is created as can be seen in table 3.3 [67][76]:
FileName   Group   Treatment
File 1     1       C
File 2     1       T
File 3     2       C
File 4     2       T
File 5     3       C
File 6     3       T
Table 3.3: Target file of the comparison of paired samples. The table is adapted from [67].
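The pairing expressed by the Group column of table 3.3 can be illustrated with an ordinary paired t-statistic in Python. Note that this is not the moderated t-test of limma, which additionally shrinks the gene-wise variances with the empirical Bayes step; the expression values in the usage note and the function names are invented for this sketch.

```python
import math
from statistics import mean, stdev


def paired_differences(targets):
    """Compute treated-minus-control differences per group.

    targets is a list of (group, treatment, value) rows, laid out like table 3.3
    with treatment 'C' (control) or 'T' (treated).
    """
    by_group = {}
    for group, treatment, value in targets:
        by_group.setdefault(group, {})[treatment] = value
    return [pair["T"] - pair["C"] for _, pair in sorted(by_group.items())]


def paired_t_statistic(differences):
    """Ordinary paired t-statistic: mean difference divided by its standard error."""
    standard_error = stdev(differences) / math.sqrt(len(differences))
    return mean(differences) / standard_error
```

For three pairs with the differences 1.0, 1.5, and 1.0, the statistic equals 7.0; because each difference is computed within a pair, variation between the pairs does not inflate the error term.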
3.3 Databases
3.3.1 ArrayExpress database
ArrayExpress [77] is a public database for microarray gene expression data developed by the European
Bioinformatics Institute (EBI). It is possible to submit, query and export three kinds of data: arrays,
experiments, and protocols. The record of a dataset includes a short description, protocols,
information about the samples and the platform, the citation, a contact to the author, and a link to
the GEO database. Expression data and other information can be downloaded for further use as well
as analysed and visualized online using Expression Profiler [77][78].
3.3.2 BiGG database
The Biochemically, Genetically and Genomically structured (BiGG) database [31] of metabolic
reconstructions contains ten different genome-scale metabolic models. It allows searching for
metabolites, reactions, genes, proteins, and literature citations, and exporting metabolic
reconstructions as SBML files. Furthermore, the BiGG database offers the possibility to visualize
metabolic maps showing metabolites, reactions, and text markup.
3.3.3 ChEBI database
The Chemical Entities of Biological Interest (ChEBI) [79] database was initiated in 2002 by the European
Bioinformatics Institute (EBI) with the objective of standardizing the biochemical terminology.
The database is focused on small molecular compounds and offers a wide range of information. The
information can be seen as a 'dictionary', which contains ChEBI ID, ChEBI name, ChEBI ASCII
name, IUPAC name (International Union of Pure and Applied Chemistry), a definition, and synonyms.
The structure of the molecular compounds is shown as a structural diagram, as IUPAC InChI (a
non-proprietary identifier for chemical structures), as InChIKey (a 27-character hashed version of the
InChI), and as SMILES (Simplified Molecular Input Line Entry System, a chemical line notation)
representation. Moreover, each entry includes information about mass, charge, formula, and the ChEBI
Ontology as well as links to other databases.
In addition ChEBI offers the possibility of downloading the database in several different file formats
[79][80].
3.3.4 GEO database
The Gene Expression Omnibus [81] database contains high-throughput gene expression data and genomic hybridization data. The objective was to create a robust and flexible database with simple
submission procedures and formats to cover a wide spectrum of high-throughput data [81][82]. Further
it should be intuitive to query, locate, review, and download data [82].
The database structure defines a primary and a secondary database [83]. The primary database includes
'submitter-supplied data' and is divided into three components [81][82][83]: Platform (GPL),
Sample (GSM), and Series (GSE). The content of the primary database is very heterogeneous. A Platform
contains a summary description of the array or sequencer and, for array-based Platforms, a data table
defining the array template [83]. The Sample record contains a description of the material,
the experimental protocols, and a data table with the abundance measurements of each feature
on the corresponding Platform [82][83]. A Series record groups a set of related Samples into a study
and describes the aim and the design of the study. Each
component has a specific prefix followed by a unique accession number [82].
The secondary database extracts the elements that are shared across the records of the primary
database to create upper-level objects called GEO DataSet and GEO Profile. A GEO DataSet includes
sequence identity tracking information for each feature on the Platform, normalized expression
measurements, and text describing the biological source and the experimental aim. Each GEO
DataSet contains a collection of related Sample records and can be identified by a unique accession
number with the prefix 'GDS'. A GEO Profile is derived from the GEO DataSet and contains the
expression measurements of one gene across all samples [82][83].
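The accession scheme described above (a record-type prefix followed by a number) can be handled with a short Python sketch. The function and constant names are chosen for this example; only the four prefixes named in the text are covered.

```python
import re

# Record types of the primary (GPL, GSM, GSE) and secondary (GDS) database.
GEO_RECORD_TYPES = {"GPL": "Platform", "GSM": "Sample", "GSE": "Series", "GDS": "DataSet"}
ACCESSION_PATTERN = re.compile(r"^(GPL|GSM|GSE|GDS)(\d+)$")


def parse_accession(accession):
    """Split a GEO accession into its record type and its accession number."""
    match = ACCESSION_PATTERN.match(accession)
    if match is None:
        raise ValueError(f"not a recognised GEO accession: {accession!r}")
    return GEO_RECORD_TYPES[match.group(1)], int(match.group(2))
```

For example, `parse_accession("GSE15773")` yields the record type Series together with the number 15773.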
The GEO records and raw data files are freely available in different file formats and it is also possible
to submit files [82].
3.3.5 Human Metabolome Database
The Human Metabolome Database (HMDB) [84] is currently the largest organism-specific metabolomics
database [85]. The database offers built-in tools for searching, viewing, and extracting metabolites,
biofluid concentrations, enzymes, genes, metabolite concentration data from mass spectrometry (MS) and
nuclear magnetic resonance (NMR) analyses, and diseases [84]. HMDB is updated every six months, as
content and coverage are growing rapidly [85]. The compounds are classified into chemical 'kingdoms',
'classes' and 'families' [85]. The result of a general search is a summary table called MetaboCard
containing 90 different fields of information. These fields can be divided into chemical or
physico-chemical data and biological or biomedical data. Moreover, the MetaboCard offers hyperlinks to
many other databases, such as KEGG, ChEBI or SwissProt [84].
3.3.6 KEGG databases
The Kyoto Encyclopedia of Genes and Genomes (KEGG) [86] project was initiated in 1995 and is a
database of the Japanese GenomeNet service. Since its beginning, the database has expanded
significantly and now includes three kinds of information: systems information, genomic information,
and chemical information [87]. Fifteen databases are assigned to these three categories (table 3.4).
Each entry is identified by a database-specific prefix followed by a number, except for entries in KEGG
GENES and KEGG ENZYME, which derive their IDs from RefSeq (genes) and ExplorEnz (enzymes)
[87].
Category              Database         Content                         Prefix                   Example
Systems information   KEGG PATHWAY     Pathway map                     map, ko, ec, rn, (org)   hsa04930
                      KEGG BRITE       Functional hierarchies          br, jp, ko, (org)        ko01003
                      KEGG MODULE      KEGG modules                    M, org_M                 M00008
                      KEGG DISEASE     Human disease                   H                        H00004
                      KEGG DRUG        Drugs                           D                        D01441
                      KEGG ENVIRON     Crude drugs, etc.               E                        E00048
Genomic information   KEGG ORTHOLOGY   KO groups                       K                        K04527
                      KEGG GENOME      KEGG organisms                  T                        T01001
                      KEGG GENES       Genes in high quality genomes                            hsa:3634
Chemical information  KEGG COMPOUND    Metabolites, small molecules    C                        C00031
                      KEGG GLYCAN      Glycans                         G                        G00109
                      KEGG REACTION    Biochemical reactions           R                        R00259
                      KEGG RPAIR       Reactant pairs                  RP                       RP04458
                      KEGG RCLASS      Reaction class                  RC                       RC00046
                      KEGG ENZYME      Enzyme nomenclature                                      ec:2.7.10.1
Table 3.4: Information about the KEGG databases. The table is adapted from [87].
The KEGG PATHWAY database includes manually drawn pathway maps of interaction and reaction
networks for metabolism, genetic information processing, environmental information processing, cellular
processes, organismal systems, human diseases, and drug development [87][88].
The KEGG COMPOUND database provides the chemical structures of metabolites and of other chemical
compounds [86], and the KEGG GLYCAN database offers the glycan structures [89].
CHAPTER 4. RESULTS
Chapter 4
Results
This chapter describes the results of the thesis. This includes the illustration of the tissue-specific
models, the gene expression analysis of adipose tissue and the analysis of reporter metabolites.
The first part of the results comprises the selection of an adequate toolbox for handling genome-scale
metabolic models in SBML file format.
The second part of the results encompasses the creation of one adipose and one liver tissue-specific
model. Therefore, adipose and liver tissue datasets were selected, preprocessed in R, and the GIMME
algorithm was applied to create the tissue-specific models. These models were compared with already
published adipocyte and liver tissue-specific models.
The third part describes the querying of gene expression data of obese tissue. The differential
expression analysis was applied to eight selected datasets using the limma package in R. The top 10
differentially expressed genes are shown in a table together with the pathways they are involved in.
The fourth part covers the analysis of reporter metabolites. For this purpose, the reporter features
algorithm was applied to each of the gene expression datasets in combination with each of the three
genome-scale metabolic models, EHMN, Recon 1, and adipocyte. The resulting reporter metabolites were
compared using four different approaches: (i) between all 42 output files; (ii) between all output
files using one genome-scale metabolic model; (iii) using different models and the same expression data;
(iv) using two kinds of expression data of one dataset (treatment vs. control), but only one model.
To make these comparisons possible, the metabolite IDs were manually added to the Recon 1 and
adipocyte models. Moreover, the pathways of the top 10 ranked genes were compared with the top 10
ranked reporter metabolites to detect agreements.
4.1 Comparison of toolboxes
Three toolboxes (COBRA, TIGER, and OptFlux) were evaluated during the thesis. After testing all of
them, the following conclusions were drawn.
The TIGER toolbox offers the functionality of converting COBRA models into TIGER models. However, this
step is very time consuming for huge metabolic networks such as the human Recon 1 or
EHMN models, and the toolbox was not capable of reading SBML files directly.
OptFlux is a standalone program, which offers a variety of functionalities. However, it has no
implemented functions for mapping gene expression data onto genome-scale metabolic models. Moreover, it
does not provide as many possibilities for handling SBML models as, for example, the COBRA toolbox.
The COBRA toolbox offers a straightforward way of reading SBML files. It allows COBRA methods to be
adapted and extended, and supports executing MATLAB methods to handle and analyse
models. Based on these advantages the COBRA toolbox was chosen for further use.
4.2 Creating tissue-specific models
This section describes the creation of one adipose and one liver tissue-specific model. For this
purpose, gene expression data were selected and preprocessed, and the panp method was applied to
calculate the presence and absence calls. The genes with their corresponding presence and absence calls
and the human Recon 1 model were used as inputs for the GIMME algorithm. The last step is the
comparison of the resulting tissue-specific models with already published adipocyte and liver models.
4.2.1 Expression data and preprocessing in R
The expression datasets were taken from the GEO database after searching for adipose tissue and
liver data. The data of the datasets GSE15773 (human adipocyte data) [14] and GSE15653 (human
liver data) [90] were read into R as CEL files. Only one kind of sample was selected from each series:
samples of insulin sensitive omental tissue (GSE15773) and samples of normal liver tissue (GSE15653).
GCRMA was applied as preprocessing method, because it is reported [68] to return the best results in
connection with the panp package for the analysis of presence and absence calls.
4.2.2 Presence/Absence calls from Negative Probesets
The R command pa.calls from the panp package was applied with a tight cutoff of 0.01 and a loose
cutoff of 0.01. Identical cutoffs were chosen to obtain only presence and absence calls, because
the construction of a tissue-specific model with presence, absence, and marginal calls is more involved.
The result of the analysis is a presence or absence label for each gene of each sample. For further
use, this output has to be condensed into one vector containing a single presence or absence call for
each gene over all samples. A strict option was chosen: a gene was only
classified as present if it was labelled present in all samples.
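The strict option can be expressed in a few lines of Python. This is a sketch of the rule as described above; the data layout (one list of calls per gene) and the gene names in the usage note are assumptions of this example.

```python
def strict_presence(calls_per_gene):
    """Condense per-sample calls into one call per gene.

    calls_per_gene maps gene -> list of 'P'/'A' calls, one per sample.
    A gene is present only if it is called present in every sample.
    """
    return {
        gene: "P" if all(call == "P" for call in calls) else "A"
        for gene, calls in calls_per_gene.items()
    }
```

For example, a gene with the calls P, P, P is kept as present, whereas a gene with the calls P, A, P is classified as absent.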
Genes in the human Recon 1 model are specified by Entrez identifiers, whereas the genes in the
expression datasets were encoded with Affymetrix manufacturer identifiers and had to be converted to
be comparable. For this purpose, the annotation packages hgu133plus2.db and hgu133a.db, available in R,
were used to rename the genes.
The two resulting vectors, one containing the genes and one containing the presence and absence calls,
were exported as CSV files.
4.2.3 Final model generation using the GIMME algorithm
The GIMME algorithm requires two inputs for creating a tissue-specific model:
- A genome-scale metabolic model
- A data structure containing the genes and the presence and absence calls
Human Recon 1 was used as genome-scale metabolic model. In order to create a precise tissue-specific
model, the lower and upper bounds as well as the objective function have to be adapted appropriately.
With these inputs the COBRA method createTissueSpecificModel reconstructs a
tissue-specific model by applying the GIMME algorithm, as described in chapter 3.1.1. In this way,
models for adipocyte and liver were created from the respective input data.
The resulting liver model is shown in the following table 4.1 and is compared to already existing liver
models [37][44] and two complete human metabolic models [3][41].
              Liver model   Liver model (Jerby et al.)   Liver model (Gille et al.)   Recon1   EHMN
Metabolites   2564          1360                         1088                         2766     6522
Reactions     2984          1826                         2539                         3743     6216
Table 4.1: Comparison of the tissue-specific liver models and the human metabolic models used as
starting points.
As can be seen, the previously published liver models differ in the number of reactions and metabolites,
as they were reconstructed using different approaches. Jerby et al. [37] constructed their model by
applying their model building algorithm (MBA) [37] with the Recon 1 model as basis. The
model of Gille et al. [44] was developed manually using the Recon 1 and EHMN models as starting
points. In each model the number of metabolites or the number of reactions is
reduced, as not all pathways are active in liver tissue. The newly created liver model still includes
a large number of metabolites and reactions, which suggests that it is less precisely adapted to the
liver.
Table 4.2 displays the differences between the newly created adipocyte model, the existing adipocyte
model of Bordbar et al. [48], and the human Recon 1 model.
              Adipocyte model   Adipocyte (Bordbar et al.)   Recon1
Metabolites   2549              554                          2766
Reactions     2830              649                          3743
Table 4.2: Comparison of the tissue-specific adipocyte models and the human metabolic model used as
starting point.
The manually curated model of Bordbar et al. contains a significantly smaller number of reactions and
metabolites than the newly constructed model.
4.3 Gene expression data analysis
Eight gene expression datasets were selected for mapping differentially expressed genes onto
genome-scale metabolic models using the reporter features algorithm. The differential expression
analysis was carried out using the limma package in R. The resulting top 10 differentially expressed
genes are listed together with the pathways in which they are involved.
4.3.1 Obtaining expression data
The first step was querying the ArrayExpress database for gene expression data of adipose tissue.
Several datasets were chosen for closer inspection and were downloaded for further use.
Table 4.3 shows an overview of the selected datasets.
Title (GSE): Gene expression in adipose tissue during weight loss (GSE11975) [14]
Platform (GPL): Agilent-012391 Whole Human Genome Oligo Microarray G4112A (GPL1708)
Design of the study: Regulation of adipose tissue gene expression during different phases of a dietary
weight loss program and its relationship with insulin sensitivity:
- 48 samples
- Energy restriction phase (ER) with a 4-week very-low-calorie diet
- Weight stabilization period (WS) composed of a 2-month low-calorie diet
- 3 to 4 months of a weight maintenance (DI) diet
- Two samples per dietary phase, one before and one after the specific phase

Title (GSE): Expression data from human adipose tissue (GSE15773) [15]
Platform (GPL): Affymetrix Human Genome U133 Plus 2.0 Array
Design of the study: Determination of gene expression signatures of omental and subcutaneous tissue
samples:
- 19 samples
- 5 insulin-resistant probands
- 5 insulin-sensitive probands
- Insulin-resistant probands and insulin-sensitive probands were paired by their body mass index
- One sample of subcutaneous and omental adipose tissue of each proband

Title (GSE): Genome-wide analysis of adipose tissue gene expression in twin-pairs discordant for
physical activity for over 30 years (GSE20536) [91]
Platform (GPL): Illumina HumanWG-6 v3.0 expression beadchip
Design of the study: Biopsy samples of adipose tissue from twin pairs that had been followed for their
discordance for physical activity for 32 years:
- 12 samples
- Two mono- and four dizygotic twins
- Paired sample per twin pair

Title (GSE): Differences in subcutaneous adipose tissue gene expression between obese African
Americans and Hispanic Youths (GSE23506) [92]
Platform (GPL): Illumina HumanHT-12 v3.0 expression beadchip
Design of the study: Cross-sectional study design to compare subcutaneous adipose tissue gene
expression profiles:
- 36 samples
- 17 African American
- 19 Hispanics

Title (GSE): Subcutaneous adipose tissue: comparison of weight maintenance and weight regain
following an 8-week low calorie diet (GSE24432) [93]
Platform (GPL): Agilent 014850 Whole Genome Microarray 4x44K G4112F
Design of the study: Forty women followed a dietary protocol consisting of an 8-week low calorie diet
(LCD) and a 6-month weight maintenance phase:
- A total of 80 samples
- 20 probands were classified as weight maintainers (WM)
- 20 probands were classified as weight regainers (WR)
- 2 paired samples per person, one before and one after LCD

Title (GSE): Characterization of the initial molecular events of adipose tissue development and
growth during overfeeding in humans (GSE28005) [13]
Platform (GPL): Affymetrix Human Genome U133 Plus 2.0 Array
Design of the study: Healthy lean and overweight subjects were submitted to a high fat diet during
56 days:
- A total of 54 samples
- 18 probands
- 3 paired samples per proband, taken at day 0, day 14, day 56

Title (GSE): Hypoxia-induced modulation of gene expression in human adipocytes (GSE34007) [94]
Platform (GPL): Agilent 014850 Whole Genome Microarray 4x44K G4112F
Design of the study: Human adipocytes (Zen-bio cells) were incubated in hypoxic conditions (1% O2)
for 24 h. Control human adipocytes were incubated under normoxic conditions (21% O2):
- 8 samples
- 8 biological replicates for each experimental condition

Title (GSE): Differential gene expression in adipose tissue from obese human subjects during weight
loss and weight maintenance (GSE35411) [95]
Platform (GPL): Affymetrix Human Genome U133 Plus 2.0 Array
Design of the study: Low calorie diet (LCD) containing 1200 kcal/day for three months. Following the
weight reduction phase for six month follow-up period:
- 26 samples
- 3 paired samples per proband, taken at baseline, after weight reduction, after weight
  maintenance phase

Table 4.3: Description of the chosen datasets.
4.3.2 Calculation of differential expression
Each SOFT file of the selected datasets was loaded into R using the GEOquery package. Subsequently,
each dataset was converted to an ExpressionSet for further analysis with the limma package. Because
of the differences between the selected microarray experiments, the design and analysis steps of each
dataset are explained below, ordered by the setup of the datasets.
The dataset GSE23506 is a cross-sectional study of a two-colour microarray experiment. A reference
pool is on one channel and the samples to compare are on the other channel. Hence, the dataset was
analysed as a comparison of two groups against a common reference, where a design matrix was created
as explained in chapter 3.2.4.
Dataset GSE34007 is a one-channel microarray experiment, which requires a simple comparison of two
groups. The design matrix is the same as for the dataset GSE23506. The target file was constructed
as described in chapter 3.2.4.
Most of the datasets contain paired samples with measurements at different points in time; depending
on the experimental setup, the analysis is a simple or a more advanced paired t-test.
The dataset GSE20536 contains data of twin pairs from a one-colour microarray. This implies that an
ordinary paired t-test, as described in chapter 3.2.4, is applied to the data.
The dataset GSE11975 contains paired samples of a two-colour microarray and can be treated like a
one-colour microarray because of the reference pool. The following comparisons were applied: before
vs. after energy restriction (ER), after energy restriction vs. after weight stabilization (WS) and before
dietary intervention vs. after weight stabilization (DI). All three comparisons are disconnected, but
have to be analysed within one experimental setup, leading to a more general version of a paired
t-test. The design matrix is created by defining the pairs and afterwards adding columns for the
specific comparisons.
A contrast matrix is not needed for these paired samples, as the comparisons are already included in
the design matrix.
The dataset GSE24432, a two-colour microarray experiment, includes four measurements against a
reference pool: (i) weight maintenance (WM) - before low calorie diet vs. after low calorie diet
and (ii) weight regainer (WR) - before low calorie diet vs. after low calorie diet. Based on the
study setup, the derived comparisons are between WM samples and WR samples. The experimental design
in limma follows the same setup as for dataset GSE11975.
Dataset GSE15773 is a one-colour microarray and includes samples from insulin resistant and insulin
sensitive subjects, whereby one sample was obtained from omental and one from subcutaneous tissue
of each subject. The comparisons describe the differences between insulin resistant and insulin
sensitive subcutaneous tissue, and between insulin resistant and insulin sensitive omental tissue.
The dataset GSE28005, a one-colour microarray, contains data from a time series where samples were
taken on day 0, day 14, and day 56. The comparisons are not predefined as in the datasets
described previously. To obtain comparable results between gene expression patterns, the following
comparisons were chosen: day 0 vs. day 14 and day 0 vs. day 56.
Dataset GSE35411 uses the same design as dataset GSE28005 and describes data of three points in
time without defined comparisons. Therefore, comparisons between baseline vs. after weight reduction
and baseline vs. weight maintenance were selected.
After applying the lmFit and eBayes methods, each gene of the eBayes output carries a manufacturer
identifier. As human metabolic models use Entrez or RefSeq identifiers, the manufacturer identifiers
cannot be used directly in the further analysis. Therefore, Entrez and RefSeq IDs were added for
each gene by querying the platform information of the SOFT file. However, for several genes no
unambiguous Entrez or RefSeq ID could be assigned, either because no ID is available or because
multiple IDs match one gene name.
Finally, the topTable function was used to summarize the results of the linear model by creating a
list of the differentially expressed genes. The following parameters were chosen:
- adjust.method: Benjamini-Hochberg
- sort.by: P-value
- number: Depending on the number of genes
Benjamini-Hochberg was selected as adjustment method to control the false discovery rate. Genes
whose adjusted p-value lies below a chosen threshold are selected as differentially expressed, and
the expected proportion of false discoveries among them is thereby kept below that threshold [67].
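How such an adjustment works can be shown with a small sketch. The Python function below is an
illustration only and not the code used in the thesis, where the adjustment was performed by
topTable in R. It implements the step-up procedure of Benjamini and Hochberg; the function name and
the four raw p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down and enforce monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values of four genes.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
significant = [i for i, p in enumerate(adj) if p < 0.05]
print(adj, significant)
```

In this toy example only the first gene stays below an adjusted threshold of 0.05, although three
of the raw p-values are smaller than 0.05.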
The resulting list of differentially expressed genes was converted into two lists: one containing
Entrez identifiers with the corresponding p-values and (log) fold changes, and one using RefSeq
identifiers. In both lists, genes without an Entrez or RefSeq identifier were deleted, and genes
with multiple identifiers were written line by line, one identifier per line.
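The two list operations, dropping genes without an identifier and writing multiple identifiers line
by line, can be sketched as follows. This Python snippet is a hypothetical illustration, not the
original R code. The field names, the semicolon as separator between multiple identifiers, and all
identifiers and values in the example are assumptions.

```python
def expand_identifiers(rows, id_field, separator=";"):
    """Drop rows without an identifier and emit one row per identifier."""
    expanded = []
    for row in rows:
        raw = row.get(id_field) or ""
        for identifier in (part.strip() for part in raw.split(separator)):
            if identifier:
                expanded.append((identifier, row["p_value"], row["log_fc"]))
    return expanded


# Hypothetical result rows; identifiers and values are made up.
results = [
    {"entrez": "101", "refseq": "NM_000001.1;NM_000002.1", "p_value": 1e-6, "log_fc": 1.2},
    {"entrez": "", "refseq": "NM_000003.1", "p_value": 2e-6, "log_fc": -0.7},
    {"entrez": "103", "refseq": None, "p_value": 3e-6, "log_fc": 0.4},
]
entrez_list = expand_identifiers(results, "entrez")
refseq_list = expand_identifiers(results, "refseq")
print(entrez_list)
print(refseq_list)
```

A gene with two RefSeq identifiers thus appears twice in the RefSeq list, each time with the same
p-value and fold change.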
Table 4.4 shows the top 10 differentially expressed genes, the pathways they are involved in, and
those reporter metabolites (highlighted in yellow) of tables 4.5 and 4.7 that are also involved in
these pathways.
Ranking | EntrezID | RefSeqID | GeneName | P-value | Pathway | Metabolites
--------|----------|----------|----------|---------|---------|------------
1 | 8365 | BC010926.1 | HIST1H4H - histone cluster 1, H4h | 2.38006191220476e-07 | hsa05034: Alcoholism |
  |  |  |  |  | hsa05322: Systemic lupus erythematosus |
2 | 55973 | NM_001008406.1 | BCAP29 - B-cell receptor-associated protein 29 | 6.2098504477583e-07 |  |
3 | 1622 | CR456956.1, M15887.1, NM_001079862.1, NM_001079863.1, NM_020548.5 | DBI - diazepam binding inhibitor (GABA receptor modulator, acyl-CoA binding protein) | 6.27199638434536e-07 | hsa03320: PPAR signaling pathway |
4 | 55969 | AF274936.1, BC001871.1, BC004446.1, NM_018840.2, NM_199483.1 | C20orf24 - chromosome 20 open reading frame 24 | 6.67608370868795e-07 |  |
5 | 125 | NM_000668.3 | ADH1B - alcohol dehydrogenase 1B (class I), beta polypeptide | 1.113560790908e-06 | hsa00010: Glycolysis / Gluconeogenesis | C00111, C00236
  |  |  |  |  | hsa00071: Fatty acid metabolism |
  |  |  |  |  | hsa00350: Tyrosine metabolism | C00122, C01036, C01179
  |  |  |  |  | hsa00830: Retinol metabolism |
  |  |  |  |  | hsa00980: Metabolism of xenobiotics by cytochrome P450 |
  |  |  |  |  | hsa00982: Drug metabolism - cytochrome P450 |
  |  |  |  |  | hsa01100: Metabolic pathways | C00097, C00111, C00122, C00199, C00236, C00606, C01036, C01179, C03684
6 | 23086 | AY099469.1, NM_015065.1 | EXPH5 - exophilin 5 | 1.1786953108756e-06 |  |
7 | 55904 | AY147037.1, NM_018682.3, NM_182931.2 | MLL5 - myeloid/lymphoid or mixed-lineage leukemia 5 | 1.50324865193418e-06 | hsa00310: Lysine degradation |
8 | 1806 | NM_000110.3 | DPYD - dihydropyrimidine dehydrogenase | 1.90562620638919e-06 | hsa00240: Pyrimidine metabolism |
  |  |  |  |  | hsa00410: beta-Alanine metabolism |
  |  |  |  |  | hsa00770: Pantothenate and CoA biosynthesis | C00097
  |  |  |  |  | hsa00983: Drug metabolism - other enzymes |
  |  |  |  |  | hsa01100: Metabolic pathways | C00097, C00111, C00122, C00199, C00236, C00606, C01036, C01179, C03684
9 | 9669 | NM_015904.3 | EIF5B - eukaryotic translation initiation factor 5B | 1.94232219398723e-06 | hsa03013: RNA transport |
10 | 1290 | BC043613.1, BC086874.1, NM_000393.3 | COL5A2 - collagen, type V, alpha 2 | 2.29350303664699e-06 | hsa04510: Focal adhesion |
  |  |  |  |  | hsa04512: ECM-receptor interaction |
  |  |  |  |  | hsa04974: Protein digestion and absorption | C00097
  |  |  |  |  | hsa05146: Amoebiasis |
Table 4.4: The top 10 differentially expressed genes from the GSE35411 (comparison 'baseline vs. after
weight reduction') dataset with the corresponding pathways and metabolites.
4.4 Reporter metabolites analysis
As mentioned in chapters 3.1.1 and 3.1.4, reporter metabolites are detected by mapping gene expression
data onto genome-scale metabolic models. For this purpose, different kinds of gene expression data and
three different genome-scale metabolic models were used:
- Human Recon 1 model
- Human EHMN model
- Adipocyte model, a tissue-specific model of Bordbar et al. [48]
The results were used to compare the reporter metabolites in the following ways: (i) comparison of
the reporter metabolites between all 42 outputs; (ii) comparison of the reporter metabolites between
all outputs using one genome-scale metabolic model; (iii) different models using the same expression
data; (iv) two kinds of expression data of one dataset (treatment vs. control), but only one model.
4.4.1 Reporter Features Analysis
The reporter features toolbox is a standalone toolbox for mapping gene expression data onto
genome-scale metabolic models to identify reporter metabolites. The toolbox requires three inputs as
tab-separated text files:
- Analysed gene expression data
- Gene-reaction interaction of the metabolic model
- Reaction-metabolite interaction of the metabolic model
The gene expression data are the differentially expressed genes of the eight datasets.
The gene-reaction interaction file describes the relationship between the genes and the reactions
within a genome-scale metabolic model and needs to be created for all models, whereby the first
column has to contain the reactions and the second column the genes. As the human Recon 1 and
adipocyte genome-scale metabolic models include reactions as well as genes, the COBRA method
findGenesFromReactions was used to create the text file. The EHMN model includes no information
about gene IDs; these were consequently taken from the supplementary Excel file, which includes
information about the interactions of the EHMN model.
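The layout of such a gene-reaction file can be sketched in a few lines. The Python code below is an
illustration under assumptions and not the MATLAB code used in the thesis: the reaction and gene
identifiers are hypothetical, and it is assumed that the toolbox expects no header line.

```python
import csv
import io


def write_gene_reaction_table(reaction_to_genes, handle):
    """Write one tab-separated line per (reaction, gene) pair."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for reaction in sorted(reaction_to_genes):
        for gene in sorted(reaction_to_genes[reaction]):
            # First column: reaction, second column: gene.
            writer.writerow([reaction, gene])


# Hypothetical mapping from reactions to the genes associated with them.
mapping = {"R1": {"102", "101"}, "R2": {"201"}}
buffer = io.StringIO()
write_gene_reaction_table(mapping, buffer)
table_text = buffer.getvalue()
print(table_text)
```

A reaction that is associated with several genes is repeated on several lines, so that every line
holds exactly one reaction-gene pair.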
The reaction-metabolite interaction input file describes the metabolite-reaction interactions and is
based on the Simple Interaction File (SIF) format [96]. This file can easily be created by loading
the SBML file of the model into Cytoscape and exporting it as a SIF file. As the toolbox cannot read
SIF files directly, the columns have to be switched (the first column has to contain the metabolites
and the second the reactions) to obtain a valid input for the reporter features toolbox.
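The column switch can be sketched as follows. The Python function below is a hypothetical
illustration, not the conversion code of the thesis. It assumes that each SIF line has the form
"source interaction target [target ...]", that the reaction is the source node, and that node names
contain no whitespace; the node names in the example are made up.

```python
def sif_to_metabolite_reaction(sif_lines):
    """Turn SIF lines into 'metabolite<TAB>reaction' lines."""
    pairs = []
    for line in sif_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # isolated node or blank line, no interaction to export
        reaction, metabolites = fields[0], fields[2:]
        for metabolite in metabolites:
            # Metabolite first, reaction second; the interaction type is dropped.
            pairs.append(f"{metabolite}\t{reaction}")
    return pairs


# Hypothetical SIF content: reaction, interaction type, metabolites.
sif = ["R1 pp M1 M2", "R2 pp M1", "M9"]
converted = sif_to_metabolite_reaction(sif)
print("\n".join(converted))
```

If the exported SIF file lists the metabolite as source node instead, only the assignment of the
first and the remaining fields has to be exchanged.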
The reporter metabolites are calculated from the differentially expressed genes of each comparison
of each dataset in combination with each genome-scale metabolic model. For all calculations the
default parameters (kmax: 100, imax: 10000, reporter degree: 1, p-value cutoff: 0.05) were used.
The reporter features toolbox returns three output files for each calculation: (i) one main output
file containing the ranking of the metabolites, (ii) one neighbour file containing all nodes and (iii) one
neighbour file providing metabolites that have some data value associated with them in the main output
file.
The first part of the main output file includes up- and down-regulated metabolites, in the second part
only up-, and in the third part only down-regulated metabolites. For the following analysis of the
metabolites these parts were split up into three single files.
4.4.2 Adapting human Recon 1
Comparing results of the reporter features toolbox between different models requires a unified
nomenclature of metabolites. As metabolite names differ in spelling, unique identifiers have to be
used. The only model containing unique identifiers is the EHMN model, where each metabolite is
assigned a KEGG COMPOUND ID, a KEGG GLYCAN ID, or an internal ID. The human Recon 1 model does not
initially include KEGG IDs of the metabolites in the SBML file, but IDs are available from the BiGG
database.
Consequently, KEGG COMPOUND and GLYCAN IDs from the BiGG database were validated and added to the
Recon 1 model. However, many metabolites remained without an ID or with a wrong ID, which required a
manual search by name or synonym to identify as many metabolites as possible. For metabolites that
were not included in any of the databases KEGG COMPOUND, KEGG GLYCAN, HMDB, or ChEBI, the internal
ID of the EHMN model was added. Nevertheless, there are still metabolites without an ID:
- IDs taken from the BiGG database: 1901
- IDs from the BiGG database, but edited: 35
- New added IDs: 218
- Number of metabolites without ID: 612
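The four counts can be checked with a short calculation. The sketch below, written in Python purely
for illustration, sums the categories listed above and derives the share of metabolites that carry
an identifier; the variable names are hypothetical.

```python
# Counts as listed above.
counts = {
    "from BiGG": 1901,
    "from BiGG, edited": 35,
    "newly added": 218,
    "without ID": 612,
}

total = sum(counts.values())
with_id = total - counts["without ID"]
share_with_id = with_id / total
print(total, with_id, round(100 * share_with_id, 1))
```

According to these numbers, roughly 78 percent of the metabolites carry an identifier after the
manual search.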
As the adipocyte model is derived from the human Recon 1 model, it includes the same metabolites,
except for some specific metabolites describing the adipose tissue. These specific metabolites could
not be found in any of the databases KEGG COMPOUND, KEGG GLYCAN, HMDB, and ChEBI.
Both models were also exported from MATLAB as SBML files including the KEGG COMPOUND and
GLYCAN IDs.
The KEGG COMPOUND and GLYCAN IDs and the complete metabolite names were added to all
those output files of the reporter features toolbox, which contain a ranking of metabolites.
4.4.3 Comparison of the reporter metabolites
The reporter metabolites analysis, applied to eight datasets in combination with the EHMN, Recon 1,
and adipocyte genome-scale metabolic models, returns 42 output files, each containing a list of
reporter metabolites. Hence, there are fourteen output files per model, as each dataset contributes
one, two, or three comparisons.
Therefore, several comparisons can be applied onto the output files of the reporter features algorithm:
- Comparison of the rank of the metabolites between all 42 outputs
- Comparison of the rank of the metabolites between all outputs of one genome-scale metabolic
model
- Comparison of the top 10 ranked metabolites between the three outputs for each model using the
same differential expression data
- Comparison of the top 10 ranked metabolites within one dataset, that contains two points of
measurements, using the same genome-scale metabolic model
All output files were loaded into a SQL database to carry out the comparisons between the reporter
metabolites. The metabolites are ranked by their p-values and matched by their KEGG ID, or by their
name if no KEGG ID is available.
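The matching rule can be illustrated with a small sketch. In the thesis the comparison was carried
out in a SQL database; the Python code below replaces it with dictionaries purely for illustration.
All function names, field names, and example entries are hypothetical, and the lower-casing of names
is an assumption about how names would be normalised.

```python
def metabolite_key(entry):
    """Use the KEGG ID when present, otherwise the normalised name."""
    kegg_id = entry.get("kegg_id")
    return kegg_id if kegg_id else entry["name"].strip().lower()


def rank_by_p_value(entries):
    """Map each metabolite key to its rank (1 = smallest p-value)."""
    ordered = sorted(entries, key=lambda e: e["p_value"])
    return {metabolite_key(e): rank for rank, e in enumerate(ordered, start=1)}


def shared_metabolites(outputs):
    """Return the ranks of all metabolites that occur in every output."""
    rankings = [rank_by_p_value(output) for output in outputs]
    common = set(rankings[0]).intersection(*rankings[1:])
    return {key: [ranking[key] for ranking in rankings] for key in sorted(common)}


# Two hypothetical output files with made-up entries.
output_a = [
    {"kegg_id": "C00001", "name": "H2O", "p_value": 0.02},
    {"kegg_id": None, "name": "Lipid X ", "p_value": 0.01},
    {"kegg_id": "C00002", "name": "ATP", "p_value": 0.50},
]
output_b = [
    {"kegg_id": "C00001", "name": "Water", "p_value": 0.30},
    {"kegg_id": None, "name": "lipid x", "p_value": 0.40},
]
shared = shared_metabolites([output_a, output_b])
print(shared)
```

The example shows both cases: one metabolite is matched by its KEGG ID although the names differ,
the other is matched by its name because no KEGG ID is available.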
The comparisons between the models and the expression data of one dataset lead to a large number of
tables. Therefore, one comparison is illustrated in the results and the other comparisons can be found
in the appendix.
Overlapping metabolites over all datasets
There are 186 reporter metabolites, compared by their KEGG ID, present in all 42 output files of the
reporter metabolites analysis. The ranking of these reporter metabolites differs significantly between
the output files.
Overlapping metabolites in each model
The number of overlapping metabolites over all output files of one genome-scale metabolic model
varies, because the output files include different reporter metabolites. In general, more
overlapping metabolites were found than over all 42 output files, because a metabolite has to be
present in the outputs of only one model and not of all three models.
- EHMN model: 1985 overlapping reporter metabolites
- Recon 1 model: 1252 overlapping reporter metabolites
- Adipocyte model: 298 overlapping reporter metabolites
However, the ranking of the overlapping reporter metabolites remains diverse for all used models.
Differential gene expression in adipose tissue from obese human subjects during weight
loss and weight maintenance (GSE35411) - Adipocyte model
Adipose tissue data were used as gene expression data and therefore the results of the adipocyte model
seem to be suitable as example files to present the results in this chapter. The first table (table 4.5)
illustrates the top 10 reporter metabolites in comparison to the rank of these metabolites using the
EHMN or Recon 1 model and the same expression data.
KEGG ID | Metabolite name | Adipocyte model: Ranking | Adipocyte model: P-value | EHMN model: Ranking | EHMN model: P-value | Recon 1 model: Ranking | Recon 1 model: P-value | Number of pathways
--------|-----------------|--------------------------|--------------------------|---------------------|---------------------|------------------------|------------------------|-------------------
C01036 | 4-Maleylacetoacetate | 1 | 0.005185 | 19 | 0.00297981 | 4 | 0.0023486 | 2
C00536 | Inorganic triphosphate | 2 | 0.00926387 | 29 | 0.00607262 | 7 | 0.00493835 | 1
C00111 | Dihydroxyacetone phosphate | 3 | 0.0115888 | 91 | 0.0249449 | 51 | 0.0317427 | 10
C00606 | 3-Sulfino-L-alanine | 4 | 0.0138944 | 52 | 0.0125323 | 12 | 0.0089768 | 3
C00097 | L-Cysteine | 5 | 0.0164945 | 109 | 0.0340762 | 102 | 0.0643997 | 11
C01179 | 3-(4-Hydroxyphenyl)pyruvate | 6 | 0.0220697 | 1996 | 0.949555 | 640 | 0.539659 | 4
C00122 | Fumarate | 7 | 0.0227082 | 476 | 0.214497 | 87 | 0.053884 | 11
C00236 | 3-Phospho-D-glyceroyl phosphate | 8 | 0.0243508 | 404 | 0.175407 | 323 | 0.242925 | 2
C00199 | D-Ribulose 5-phosphate | 9 | 0.0249655 | 64 | 0.0161579 | 19 | 0.0141511 | 5
C03684 | 6-Pyruvoyl-5,6,7,8-tetrahydropterin | 10 | 0.0267392 | 29 | 0.00607262 | 23 | 0.0159692 | 2
Table 4.5: Top 10 metabolites of the GSE35411 (comparison 'baseline vs. after weight reduction')
dataset using the adipocyte model in comparison to the EHMN and Recon 1 model.
The ranking of the reporter metabolites differs between the outputs of the models based on the same
expression data. Moreover, it can happen that top 10 reporter metabolites of one model cannot be
found in the ranking created with the other models and the same expression data.
Another kind of comparison is the detection of differences between the reporter metabolites using
two measurements within one dataset and the same model. This kind of comparison, displayed in
table 4.6, illustrates the divergence between the two points of measurement, whereas the comparison
between the models aims at similar rankings of the reporter metabolites.
KEGG ID | Metabolite name | base-loss: Ranking | base-loss: P-value | base-main: Ranking | base-main: P-value
--------|-----------------|--------------------|--------------------|--------------------|-------------------
C01036 | 4-Maleylacetoacetate | 1 | 0.005185 | 189 | 0.620879
C00536 | Inorganic triphosphate | 2 | 0.00926387 | 15 | 0.0329617
C00111 | Dihydroxyacetone phosphate | 3 | 0.0115888 | 144 | 0.465898
C00606 | 3-Sulfino-L-alanine | 4 | 0.0138944 | 234 | 0.752851
C00097 | L-Cysteine | 5 | 0.0164945 | 116 | 0.364113
C01179 | 3-(4-Hydroxyphenyl)pyruvate | 6 | 0.0220697 | 198 | 0.654706
C00122 | Fumarate | 7 | 0.0227082 | 108 | 0.331964
C00236 | 3-Phospho-D-glyceroyl phosphate | 8 | 0.0243508 | 18 | 0.0445467
C00199 | D-Ribulose 5-phosphate | 9 | 0.0249655 | 209 | 0.682199
C03684 | 6-Pyruvoyl-5,6,7,8-tetrahydropterin | 10 | 0.0267392 | 59 | 0.165882
Table 4.6: GSE35411 dataset using the adipocyte model illustrating the comparison of the top 10
metabolites between baseline vs. after weight reduction and baseline vs. weight maintenance.
Table 4.7 shows the pathways in which the top 10 reporter metabolites are involved. The top 10
reporter metabolites are involved in 35 pathways and the top 10 genes in 22 pathways. There are five
pathways (coloured yellow in tables 4.4 and 4.7) in which top 10 ranked reporter metabolites as well
as top 10 ranked genes are involved.
Pathway | Metabolites
--------|------------
hsa01100 Metabolic pathways | cpd:C00097 L-Cysteine; cpd:C00111 Glycerone phosphate; cpd:C00122 Fumarate; cpd:C00199 D-Ribulose 5-phosphate; cpd:C00236 3-Phospho-D-glyceroyl phosphate; cpd:C00606 3-Sulfino-L-alanine; cpd:C01036 4-Maleylacetoacetate; cpd:C01179 3-(4-Hydroxyphenyl)pyruvate; cpd:C03684 6-Pyruvoyltetrahydropterin
hsa00010 Glycolysis / Gluconeogenesis | cpd:C00111 Glycerone phosphate; cpd:C00236 3-Phospho-D-glyceroyl phosphate
hsa00020 Citrate cycle (TCA cycle) | cpd:C00122 Fumarate
hsa00030 Pentose phosphate pathway | cpd:C00199 D-Ribulose 5-phosphate
hsa00040 Pentose and glucuronate interconversions | cpd:C00111 Glycerone phosphate; cpd:C00199 D-Ribulose 5-phosphate
hsa00051 Fructose and mannose metabolism | cpd:C00111 Glycerone phosphate
hsa00052 Galactose metabolism | cpd:C00111 Glycerone phosphate
hsa00130 Ubiquinone and other terpenoid-quinone biosynthesis | cpd:C01179 3-(4-Hydroxyphenyl)pyruvate
hsa00190 Oxidative phosphorylation | cpd:C00122 Fumarate; cpd:C00536 Triphosphate
hsa00250 Alanine, aspartate and glutamate metabolism | cpd:C00122 Fumarate
hsa00260 Glycine, serine and threonine metabolism | cpd:C00097 L-Cysteine
hsa00270 Cysteine and methionine metabolism | cpd:C00097 L-Cysteine; cpd:C00606 3-Sulfino-L-alanine
hsa00330 Arginine and proline metabolism | cpd:C00122 Fumarate
hsa00350 Tyrosine metabolism | cpd:C00122 Fumarate; cpd:C01036 4-Maleylacetoacetate; cpd:C01179 3-(4-Hydroxyphenyl)pyruvate
hsa00360 Phenylalanine metabolism | cpd:C00122 Fumarate
hsa00400 Phenylalanine, tyrosine and tryptophan biosynthesis | cpd:C01179 3-(4-Hydroxyphenyl)pyruvate
hsa00430 Taurine and hypotaurine metabolism | cpd:C00097 L-Cysteine; cpd:C00606 3-Sulfino-L-alanine
hsa00480 Glutathione metabolism | cpd:C00097 L-Cysteine
hsa00561 Glycerolipid metabolism | cpd:C00111 Glycerone phosphate
hsa00562 Inositol phosphate metabolism | cpd:C00111 Glycerone phosphate
hsa00564 Glycerophospholipid metabolism | cpd:C00111 Glycerone phosphate
hsa00620 Pyruvate metabolism | cpd:C00111 Glycerone phosphate
hsa00650 Butanoate metabolism | cpd:C00122 Fumarate
hsa00730 Thiamine metabolism | cpd:C00097 L-Cysteine
hsa00740 Riboflavin metabolism | cpd:C00199 D-Ribulose 5-phosphate
hsa00750 Vitamin B6 metabolism | cpd:C00199 D-Ribulose 5-phosphate
hsa00760 Nicotinate and nicotinamide metabolism | cpd:C00111 Glycerone phosphate; cpd:C00122 Fumarate
hsa00770 Pantothenate and CoA biosynthesis | cpd:C00097 L-Cysteine
hsa00790 Folate biosynthesis | cpd:C03684 6-Pyruvoyltetrahydropterin
hsa00920 Sulfur metabolism | cpd:C00097 L-Cysteine
hsa00970 Aminoacyl-tRNA biosynthesis | cpd:C00097 L-Cysteine
hsa04122 Sulfur relay system | cpd:C00097 L-Cysteine
hsa04974 Protein digestion and absorption | cpd:C00097 L-Cysteine
hsa05200 Pathways in cancer | cpd:C00122 Fumarate
hsa05211 Renal cell carcinoma | cpd:C00122 Fumarate
Table 4.7: The pathways in which the top 10 reporter metabolites are involved in.
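The overlap between the pathways of the top 10 genes and those of the top 10 reporter metabolites
can be recomputed as a set intersection. The following Python sketch is an illustration and not the
procedure used in the thesis. The two sets are transcribed from tables 4.4 and 4.7 as far as the
pathway identifiers are legible there, and the variable names are hypothetical.

```python
# Pathways of the top 10 genes, transcribed from table 4.4.
gene_pathways = {
    "hsa05034", "hsa05322", "hsa03320", "hsa00010", "hsa00071",
    "hsa00350", "hsa00830", "hsa00980", "hsa00982", "hsa01100",
    "hsa00310", "hsa00240", "hsa00410", "hsa00770", "hsa00983",
    "hsa03013", "hsa04510", "hsa04512", "hsa04974", "hsa05146",
}

# Pathways of the top 10 reporter metabolites, transcribed from table 4.7.
metabolite_pathways = {
    "hsa01100", "hsa00010", "hsa00020", "hsa00030", "hsa00040",
    "hsa00051", "hsa00052", "hsa00130", "hsa00190", "hsa00250",
    "hsa00260", "hsa00270", "hsa00330", "hsa00350", "hsa00360",
    "hsa00400", "hsa00430", "hsa00480", "hsa00561", "hsa00562",
    "hsa00564", "hsa00620", "hsa00650", "hsa00730", "hsa00740",
    "hsa00750", "hsa00760", "hsa00770", "hsa00790", "hsa00920",
    "hsa00970", "hsa04122", "hsa04974", "hsa05200", "hsa05211",
}

# Pathways in which both top 10 genes and top 10 metabolites are involved.
shared_pathways = sorted(gene_pathways & metabolite_pathways)
print(shared_pathways)
```

With the identifiers as transcribed here, the intersection contains hsa00010, hsa00350, hsa00770,
hsa01100, and hsa04974.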
CHAPTER 5. DISCUSSION
Chapter 5
Discussion
The objective of the thesis is mapping gene expression data onto genome-scale metabolic models. For
this purpose, two tissue-specific models, one liver and one adipocyte model, were created and the
detection of reporter metabolites was carried out.
First of all an adequate toolbox for handling genome-scale metabolic models, in SBML file format, had
to be chosen.
The first step was the creation of one liver and one adipocyte tissue-specific model. These newly created
models were also compared with already published genome-scale metabolic models of the liver and the
adipocyte.
The second step deals with the collection and preprocessing of gene expression data from obese tissues.
The differentially expressed genes were used for further analysis with the reporter features algorithm
and the top 10 differentially expressed genes were shown in a table with their corresponding pathways.
The next step was the application of the reporter features algorithm to calculate the reporter
metabolites using the differentially expressed genes of each dataset in combination with each of the
three genome-scale metabolic models: adipocyte, EHMN, and Recon 1.
To enable a sound comparison, the KEGG COMPOUND and GLYCAN IDs were added manually to the
Recon 1 and adipocyte models, whereas the EHMN model already includes them.
As final step the results of the reporter features algorithm were compared as follows: (i) between all 42
output files; (ii) between all output files using one genome-scale metabolic model; (iii) different models
and the same expression data; (iv) two kinds of expression data of one dataset (treatment vs. control),
but only one model. Moreover, the pathways of the top 10 ranked genes were compared with those of
the top 10 ranked reporter metabolites to detect matches.
Most existing toolboxes for handling genome-scale metabolic models implement only a limited set of
functionalities. Therefore, it was necessary to use different toolboxes to obtain the desired
result. This implies that data had to be converted into several distinct file formats, which makes
errors more likely. Furthermore, the majority of these conversions had to be carried out manually
by implementing own methods or adapting existing algorithms.
The COBRA toolbox was selected for this thesis, because it offered a large variety of implemented
functionalities. During the work, some COBRA methods proved to be well adapted to the Recon 1 model,
but not to the other models used in the thesis. Moreover, the implementation of some basic
functionalities proved to be incomplete, covering only specific parts of a standard used by the
toolbox. For example, the import function was only able to read the SBML files of the Recon 1 and
derived tissue-specific models, but not those of other models like the EHMN model (although all
models were valid SBML files). The design as a MATLAB toolbox made the adaptation that was necessary
to process all SBML files with the COBRA toolbox feasible.
The creation of a tissue-specific model was a challenging task, because many different parameters had
to be chosen adequately and no existing guideline was available.
The first step was the preprocessing of the chosen raw data (from CEL files) in R. The GCRMA
approach was used, because it is reported to return the best results in combination with the selected
panp method [68] to calculate presence and absence calls.
For the panp method the cutoff values had to be chosen, and it had to be decided whether the input
data for the GIMME algorithm should contain only present and absent calls, or present, absent, and
marginal calls. Calculating the calls for each gene in each sample file returned a matrix. However,
the GIMME algorithm required a vector as input. Therefore, it had to be decided which conditions
have to be fulfilled for a gene to be called present, absent, or marginal. For instance, a gene
could be called present only if it is labelled as present in all samples, or if it is labelled as
present in a given percentage of the samples. It should be noted that the more samples are used,
the less likely it is that a gene is present in all of them.
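The reduction of the call matrix to a vector can be sketched as follows. The Python code is a
hypothetical illustration of the two conditions mentioned above and not the rule actually applied in
the thesis. The call labels "P", "M", and "A", the parameter min_fraction, and the example genes are
assumptions, and marginal calls are simply treated as not present.

```python
def collapse_calls(call_matrix, min_fraction=1.0):
    """Collapse per-sample calls to one call per gene.

    A gene is called present ('P') if at least min_fraction of its
    samples are labelled 'P'; otherwise it is called absent ('A').
    """
    collapsed = {}
    for gene, calls in call_matrix.items():
        present_fraction = sum(1 for call in calls if call == "P") / len(calls)
        collapsed[gene] = "P" if present_fraction >= min_fraction else "A"
    return collapsed


# Hypothetical calls of three genes in four samples.
calls = {
    "gene1": ["P", "P", "P", "P"],
    "gene2": ["P", "P", "P", "A"],
    "gene3": ["A", "M", "A", "A"],
}
strict = collapse_calls(calls)                       # present in all samples
relaxed = collapse_calls(calls, min_fraction=0.75)   # present in at least 75 %
print(strict, relaxed)
```

Under the simplifying assumption that each of n samples independently yields a present call with
probability q, the probability of a present call in all samples is q^n, which decreases as n grows.
This is why the strict rule becomes harder to satisfy with more samples.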
Another difficulty was the genome-scale metabolic model itself. To obtain a well-adapted
tissue-specific model, the lower and upper bounds as well as the objective function had to be chosen
correctly, because these parameters influence the final model. The resulting tissue-specific model
is often modified afterwards to adapt it more precisely to the target tissue.
Both newly reconstructed tissue-specific models, the adipocyte and the liver model, seemed to lack
accuracy, because the already existing models contain fewer metabolites and reactions. Besides the
influence of the input data, the inaccuracy arose from the need to define the upper and lower bounds
and the objective function precisely. This step requires extensive knowledge about the genome-scale
metabolic model used as basis and about the target tissue-specific model. For this reason, there
is still room for further improvement.
Eight adipose tissue datasets were selected and the already preprocessed files (SOFT files) were used
for the differential gene expression analysis. This step was carried out in R using the limma
package. For this, a design matrix and, where required, a contrast matrix had to be constructed for
each dataset. The descriptions of the different linear models for the analysis of differentially
expressed genes are available in [67][70] and cover a large number of different microarray designs.
The example files had rather basic design and contrast matrices, whereas the datasets used included
paired samples with measurements at different points in time. Therefore, the construction of the
matrices was more advanced, but the documentation explained it comprehensibly.
The online version of the reporter toolbox did not work for large network files. Hence, a standalone
version of the reporter features tool was requested. Using the standalone version, each set of gene
expression data was combined with each genome-scale metabolic model (adipocyte, EHMN, Recon 1)
to calculate the corresponding reporter metabolites.
The step of adding KEGG COMPOUND and GLYCAN IDs to the Recon 1 and adipocyte model was
done by a manual search of the metabolites in different online databases. The main difficulty was that
the spelling varies for each metabolite and that each metabolite name has a lot of synonyms. In most
cases, not all the information about a metabolite could be found using only one database. Hence,
three databases were used: KEGG, ChEBI, and HMDB. The completeness of the databases varies,
meaning that a metabolite is not included in all databases and the information about a metabolite
differs between the databases. Most metabolites could be found in the HMDB or ChEBI database, because
they contain the most synonyms per metabolite. The corresponding KEGG ID could be found
via a provided hyperlink or by trying synonyms as input for the search in the KEGG database. The
search of glycans was even more difficult, because the glycans are mainly in the ChEBI and KEGG
GLYCAN database. Furthermore, their spelling was completely different between the Recon 1 model
and the KEGG GLYCAN database.
All in all, this was a very time-consuming step, which could have been repeated several times to match
as many metabolites as possible to a KEGG ID. An aggravating circumstance was the heterogeneity and
inconsistency between the three databases, but they are updated regularly and contain more and more
metabolites.
After adding the KEGG IDs to the ranked metabolites of the output files of the reporter features
toolbox, a comparison between the models was carried out by using the KEGG ID and if no KEGG ID
was available, the metabolite names. Thereby, the first conclusion was that the number of overlapping
reporter metabolites is much higher when only the fourteen output files of one model are compared
than when all 42 output files of all models are compared.
Differences in the ranking of the reporter metabolites occurred in all comparisons. One comparison
had the aim of detecting the differences between the reporter metabolites using the top 10 ranked
metabolites within one dataset, that contains two points of measurements. The remaining comparisons
had the objective of finding the similarities between the ranking of the reporter metabolites.
One reason for the differences in the ranking was that each output file of the reporter features
toolbox contained an individual list of reporter metabolites. This list was influenced by the
expression data and the genome-scale metabolic models. The three genome-scale metabolic models have
a different number of metabolites, reactions, and genes. Therefore, the resulting ranked lists also
contained a different number of reporter metabolites.
A further reason is the incompleteness of the IDs of the Recon 1 and adipocyte models
as well as the internal IDs of the EHMN model. This caused problems in matching the metabolites
between the models and led to unidentified matches, which affect the accuracy and completeness of
the results.
Despite the differences between the models used and the incomplete IDs, similar rankings of reporter
metabolites could be observed in the comparison between the three models with the same underlying
differentially expressed genes. These comparisons were done for all 42 output files of the reporter
features toolbox and are illustrated in the appendix. Moreover, identical pathways could be detected
by comparing the pathways of the top 10 ranked reporter metabolites and the top 10 ranked
differentially expressed genes.
To conclude, genome-scale metabolic models include a lot of biological information about the
interconnections of reactions, metabolites, and genes. Therefore, they can be used as a powerful tool to study
the human metabolism as well as metabolic diseases or the influence of drugs. Increasing attention is
paid to the construction of tissue-specific models to get more precise metabolic models of the human
key tissues and cells [97].
LIST OF FIGURES
List of Figures
2.1 Illustration of the different approaches for visualizing and analysing metabolic networks
taken from [22].
2.2 In this figure a bipartite graph is represented. The circles represent the metabolite
vertices and the rectangles the reaction vertices. The figure is redrawn from [24].
2.3 Example of a minimal model of glycolysis to illustrate the kinetic approach [22]. A is the
reaction scheme and shows a graphical presentation of a minimal model of glycolysis. It
shows that one unit of glucose (G) is converted by reactions into two units of pyruvate
(P). B shows the stoichiometric matrix N, which includes the information of the metabolites
in their rows and the information about the reactions in the columns. Gx, Px,
and Glx represent external metabolites, which are not in the stoichiometric matrix. C
represents the reaction list of the model and D the dynamic mass-balance equation or
system of differential equations [21][22].
2.4 The figure depicts two similar reaction networks and the corresponding elementary flux
modes. It is taken from [2].
2.5 The figure illustrates the five steps of flux balance analysis [25].
2.6 The four major applications of global human metabolic models [42].
2.7 The reconstruction of a metabolic reaction network is done with data from literature and
gene-protein-reaction (GPR) relationships from experimental data. This information is
converted into a stoichiometric matrix for the following simulation step, which is an
iterative process. Thereby flux balance analysis is used to calculate the steady-state
fluxes through the network using constraints. The results are analysed and validated
with the help of different methods. Subsequently these outcomes are data of, for example,
published literature or online databases, which might be used as new input for metabolic
network reconstructions [50].
3.1 Overview of the COBRA toolbox functionalities [5].
3.2 Illustration of the step-wise procedure of the reporter metabolites algorithm [52].
3.3 The figure illustrates the conversion of a COBRA model into a TIGER model [54].
3.4 Illustration of the step-wise procedure of the Reporter Features algorithm [52].
3.5 The expression density for classifying genes as present, absent or marginal [68].
LIST OF TABLES
List of Tables
2.1 Comparison of the human metabolic models.
3.1 Target file of the comparison of two groups against a common reference. The table is adapted from [67].
3.2 Target file of the comparison of two groups. The table is adapted from [67].
3.3 Target file of the comparison of paired samples. The table is adapted from [67].
3.4 Information about the KEGG databases. The table is adapted from [87].
4.1 Comparison of the tissue-specific liver models and the human metabolic models used as starting point.
4.2 Comparison of the tissue-specific adipocyte models and the human metabolic models used as starting point.
4.3 Description of the chosen datasets.
4.4 The top 10 differentially expressed genes from the GSE35411 (comparison ’baseline vs. after weight reduction’) dataset with the corresponding pathways and metabolites.
4.5 Top 10 metabolites of the GSE35411 (comparison ’baseline vs. after weight reduction’) dataset using the adipocyte model in comparison to the EHMN and Recon 1 model.
4.6 GSE35411 dataset using the adipocyte model illustrating the comparison of the top 10 metabolites between baseline vs. after weight reduction and baseline vs. weight maintenance.
4.7 The pathways in which the top 10 reporter metabolites are involved.
Bibliography
[1] Nielsen J. Transcriptional control of metabolic fluxes. Mol Syst Biol. 2011 Mar;7:478.
[2] Klipp E, Herwig R, Kowald A, Wierling C, Lehrach H. Systems Biology in Practice. Concepts,
Implementation and Application. Wiley-VCH; 2005.
[3] Duarte NC, Becker SA, Jamshidi N, Thiele I, Mo ML, Vo TD, et al. Global reconstruction of the
human metabolic network based on genomic and bibliomic data. Proc Natl Acad Sci USA. 2007
Feb;104(6):1777–1782.
[4] Edelman LB, Eddy JA, Price ND. In silico models of cancer. Wiley Interdiscip Rev Syst Biol
Med. 2010;2(4):438–459.
[5] Schellenberger J, Que R, Fleming RMT, Thiele I, Orth JD, Feist AM, et al. Quantitative prediction
of cellular metabolism with constraint-based models: the COBRA Toolbox v2.0. Nat Protoc. 2011
Sep;6(9):1290–1307.
[6] Palsson B, Zengler K. The challenges of integrating multi-omic data sets. Nat Chem Biol. 2010
Nov;6(11):787–789.
[7] Förster J, Famili I, Fu P, Palsson B, Nielsen J. Genome-scale reconstruction of the Saccharomyces
cerevisiae metabolic network. Genome Res. 2003 Feb;13(2):244–253.
[8] Ma H, Sorokin A, Mazein A, Selkov A, Selkov E, Demin O, et al. The Edinburgh human metabolic
network reconstruction and its functional analysis. Mol Syst Biol. 2007;3:135.
[9] Li L, Zhou X, Ching WK, Wang P. Predicting enzyme targets for cancer drugs by profiling human
metabolic reactions in NCI-60 cell lines. BMC Bioinformatics. 2010;11:501.
[10] Folger O, Jerby L, Frezza C, Gottlieb E, Ruppin E, Shlomi T. Predicting selective drug targets in
cancer through metabolic networks. Mol Syst Biol. 2011;7:501.
[11] Obesity and overweight. World Health Organization; 2012. Available from: http://www.who.int/
mediacentre/factsheets/fs311/en/index.html [cited 2012 Jul 25].
[12] Obesity: preventing and managing the global epidemic. Report of a WHO consultation. World
Health Organ Tech Rep Ser. 2000;894:i–xii, 1–253.
[13] Alligier M, Meugnier E, Debard C, Lambert-Porcheron S, Chanseaume E, Sothier M, et al. Subcutaneous adipose tissue remodeling during the initial phase of weight gain induced by overfeeding
in humans. J Clin Endocrinol Metab. 2012 Feb;97(2):E183–E192.
[14] Capel F, Klimcáková E, Viguerie N, Roussel B, Vítková M, Kováciková M, et al. Macrophages
and adipocytes in human obesity: adipose tissue gene expression and insulin sensitivity during
calorie restriction and weight stabilization. Diabetes. 2009 Jul;58(7):1558–1567.
[15] Hardy OT, Perugini RA, Nicoloro SM, Gallagher-Dorval K, Puri V, Straubhaar J, et al. Body
mass index-independent inflammation in omental adipose tissue associated with insulin resistance
in morbid obesity. Surg Obes Relat Dis. 2011;7(1):60–67.
[16] Schilling CH, Letscher D, Palsson BO. Theory for the systemic definition of metabolic pathways
and their use in interpreting metabolic function from a pathway-oriented perspective. J Theor
Biol. 2000 Apr;203(3):229–248.
[17] Schilling CH, Schuster S, Palsson BO, Heinrich R. Metabolic pathway analysis: basic concepts
and scientific applications in the post-genomic era. Biotechnol Prog. 1999;15(3):296–303.
[18] Garrett R, Grisham C. Biochemistry. Physical Science David Harris; 2005.
[19] Seager S, Slabaugh M. Organic and Biochemistry for Today. 7th ed. Hartford C.; 2011.
[20] Deisboeck T, Kresh J, editors. Complex Systems Science in Biomedicine. Topics in Biomedical
Engineering; 2006. Springer Verlag.
[21] Steuer R. Computational approaches to the topology, stability and dynamics of metabolic networks.
Phytochemistry. 2007;68(16-18):2139–2151.
[22] Steuer R, Junker BH. Computational Models of Metabolism: Stability and Regulation in Metabolic
Networks. In: Rice SA, editor. Advances in Chemical Physics. vol. 142. Hoboken, NJ, USA: John
Wiley & Sons, Inc.; 2008.
[23] Emmert-Streib F, Dehmer M. Networks for systems biology: conceptual connection of data and
function. IET Syst Biol. 2011 May;5(3):185–207.
[24] Dehmer M, Emmert-Streib F, Graber A, Salvador A, editors. Applied Statistics for Network
Biology: Methods in Systems Biology. Wiley-VCH; 2011. To appear.
[25] Orth JD, Thiele I, Palsson B. What is flux balance analysis? Nat Biotechnol. 2010 Mar;28(3):245–
248.
[26] Raman K, Chandra N. Flux balance analysis of biological systems: applications and challenges.
Brief Bioinform. 2009 Jul;10(4):435–449.
[27] Mrabet Y, Semmar N. Mathematical methods to analysis of topology, functional variability and
evolution of metabolic systems based on different decomposition concepts. Curr Drug Metab. 2010
May;11(4):315–341.
[28] Price ND, Reed JL, Palsson B. Genome-scale models of microbial cells: evaluating the consequences
of constraints. Nat Rev Microbiol. 2004 Nov;2(11):886–897.
[29] Pinchuk GE, Hill EA, Geydebrekht OV, De Ingeniis J, Zhang X, Osterman A, et al. Constraint-based model of Shewanella oneidensis MR-1 metabolism: a tool for data analysis and hypothesis
generation. PLoS Comput Biol. 2010 Jun;6(6):e1000822.
[30] Westerhoff HV, Palsson BO. The evolution of molecular biology into systems biology. Nat Biotechnol. 2004 Oct;22(10):1249–1252.
[31] Schellenberger J, Park JO, Conrad TM, Palsson B. BiGG: a Biochemical Genetic and Genomic
knowledgebase of large scale metabolic reconstructions. BMC Bioinformatics. 2010;11:213.
[32] Oh YK, Joyce AR, Palsson BO. Constraint-based Genome-Scale In Silico Models for Systems
Biology. Asia Pacific Biotech News. 2006;10(3):123–136.
[33] Reed JL, Vo TD, Schilling CH, Palsson BO. An expanded genome-scale model of Escherichia coli
K-12 (iJR904 GSM/GPR). Genome Biol. 2003;4(9):R54.
[34] Feist AM, Henry CS, Reed JL, Krummenacker M, Joyce AR, Karp PD, et al. A genome-scale
metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and
thermodynamic information. Mol Syst Biol. 2007;3:121.
[35] Thiele I, Palsson B. A protocol for generating a high-quality genome-scale metabolic reconstruction.
Nat Protoc. 2010;5(1):93–121.
[36] Famili I, Forster J, Nielsen J, Palsson BO. Saccharomyces cerevisiae phenotypes can be predicted
by using constraint-based analysis of a genome-scale reconstructed metabolic network. Proc Natl
Acad Sci USA. 2003 Nov;100(23):13134–13139.
[37] Jerby L, Shlomi T, Ruppin E. Computational reconstruction of tissue-specific metabolic models:
application to human liver metabolism. Mol Syst Biol. 2010 Sep;6:401.
[38] Rolfsson O, Palsson B, Thiele I. The human metabolic reconstruction Recon 1 directs hypotheses
of novel human metabolic functions. BMC Syst Biol. 2011;5:155.
[39] Shlomi T, Cabili MN, Herrgård MJ, Palsson B, Ruppin E. Network-based prediction of human
tissue-specific metabolism. Nat Biotechnol. 2008 Sep;26(9):1003–1010.
[40] Shlomi T, Benyamini T, Gottlieb E, Sharan R, Ruppin E. Genome-scale metabolic modeling
elucidates the role of proliferative adaptation in causing the Warburg effect. PLoS Comput Biol.
2011 Mar;7(3):e1002018.
[41] Hao T, Ma HW, Zhao XM, Goryanin I. Compartmentalization of the Edinburgh Human Metabolic
Network. BMC Bioinformatics. 2010;11:393.
[42] Bordbar A, Palsson BO. Using the reconstructed genome-scale human metabolic network to study
physiology and pathology. J Intern Med. 2012 Feb;271(2):131–141.
[43] Bordbar A, Lewis NE, Schellenberger J, Palsson B, Jamshidi N. Insight into human alveolar
macrophage and M. tuberculosis interactions via metabolic reconstructions. Mol Syst Biol. 2010
Oct;6:422.
[44] Gille C, Bölling C, Hoppe A, Bulik S, Hoffmann S, Hübner K, et al. HepatoNet1: a comprehensive
metabolic reconstruction of the human hepatocyte for the analysis of liver physiology. Mol Syst
Biol. 2010 Sep;6:411.
[45] Chang RL, Xie L, Xie L, Bourne PE, Palsson B. Drug off-target effects predicted using structural
analysis in the context of a metabolic network model. PLoS Comput Biol. 2010;6(9):e1000938.
[46] Lewis NE, Schramm G, Bordbar A, Schellenberger J, Andersen MP, Cheng JK, et al. Large-scale
in silico modeling of metabolic interactions between cell types in the human brain. Nat Biotechnol.
2010 Dec;28(12):1279–1285.
[47] Bordbar A, Jamshidi N, Palsson BO. iAB-RBC-283: A proteomically derived knowledge-base
of erythrocyte metabolism that can be used to simulate its physiological and patho-physiological
states. BMC Syst Biol. 2011;5:110.
[48] Bordbar A, Feist AM, Usaite-Black R, Woodcock J, Palsson BO, Famili I. A multi-tissue type
genome-scale metabolic network for analysis of whole-body systems physiology. BMC Syst Biol.
2011;5:180.
[49] Hucka M, Bergmann FT, Hoops S, Keating SM, Sahle S, Smith LP, et al. The systems biology
markup language (SBML): a medium for representation and exchange of biochemical network
models. Nature Proceedings; 2010.
[50] Gianchandani EP, Chavali AK, Papin JA. The application of flux balance analysis in systems
biology. Wiley Interdiscip Rev Syst Biol Med. 2010;2(3):372–382.
[51] Becker SA, Palsson BO. Context-specific metabolic networks are consistent with experiments.
PLoS Comput Biol. 2008 May;4(5):e1000082.
[52] Patil KR, Nielsen J. Uncovering transcriptional regulation of metabolism by using metabolic
network topology. Proc Natl Acad Sci USA. 2005 Feb;102(8):2685–2689.
[53] Herold H, Lurz B, Wohlrab J. Grundlagen der Informatik. Pearson Studium; 2006.
[54] Jensen PA, Lutz KA, Papin JA. TIGER: Toolbox for integrating genome-scale metabolic models,
expression data, and transcriptional regulatory networks. BMC Syst Biol. 2011;5:147.
[55] Klamt S, Saez-Rodriguez J, Gilles ED. Structural and functional analysis of cellular networks with
CellNetAnalyzer. BMC Syst Biol. 2007;1:2.
[56] Cvijovic M, Olivares-Hernández R, Agren R, Dahr N, Vongsangnak W, Nookaew I, et al. BioMet Toolbox: genome-wide analysis of metabolism. Nucleic Acids Res. 2010 Jul;38(Web Server
issue):W144–W149.
[57] Jensen PA, Papin JA. Functional integration of a metabolic network model and expression data
without arbitrary thresholding. Bioinformatics. 2011 Feb;27(4):541–547.
[58] Rocha I, Maia P, Evangelista P, Vilaça P, Soares S, Pinto JP, et al. OptFlux: an open-source
software platform for in silico metabolic engineering. BMC Syst Biol. 2010;4:45.
[59] Kamp A, Schuster S. Metatool 5.0: fast and flexible elementary modes analysis. Bioinformatics.
2006 Aug;22(15):1930–1931.
[60] Segrè D, Vitkup D, Church GM. Analysis of optimality in natural and perturbed metabolic
networks. Proc Natl Acad Sci USA. 2002 Nov;99(23):15112–15117.
[61] Shlomi T, Berkman O, Ruppin E. Regulatory on/off minimization of metabolic flux changes after
genetic perturbations. Proc Natl Acad Sci USA. 2005 May;102(21):7695–7700.
[62] Burgard AP, Pharkya P, Maranas CD. Optknock: a bilevel programming framework for identifying
gene knockout strategies for microbial strain optimization. Biotechnol Bioeng. 2003 Dec;84(6):647–
657.
[63] Patil KR, Rocha I, Förster J, Nielsen J. Evolutionary programming as a platform for in silico
metabolic engineering. BMC Bioinformatics. 2005;6:308.
[64] Terzer M, Stelling J. Large-scale computation of elementary flux modes with bit pattern trees.
Bioinformatics. 2008 Oct;24(19):2229–2235.
[65] Funahashi A, Morohashi M, Kitano H. CellDesigner: a process diagram editor for gene-regulatory
and biochemical networks. BIOSILICO. 2003;1:159–162.
[66] Oliveira AP, Patil KR, Nielsen J. Architecture of transcriptional regulatory circuits is knitted over
the topology of bio-molecular interaction networks. BMC Syst Biol. 2008;2:17.
[67] Gentleman R, Carey V, Huber W, Irizarry R, Dudoit S, editors. Bioinformatics and Computational
Biology Solutions Using R and Bioconductor (Statistics for Biology and Health). New York:
Springer Science+Business Media; 2005.
[68] Warren P, Taylor D, Martini PGV, Jackson J, Bienkowska J. panp: Presence-Absence Calls from
Negative Strand Matching Probesets. Under review.
[69] Irizarry RA, Wu Z, Jaffee HA. Comparison of Affymetrix GeneChip expression measures. Bioinformatics. 2006 Apr;22(7):789–794.
[70] Hahne F, Huber W, Gentleman R, Falcon S. Bioconductor Case Studies. 1st ed. Springer Publishing Company, Incorporated; 2008.
[71] Wu Z, Irizarry RA, Gentleman R, Murillo FM, Spencer F. A Model Based Background Adjustment
for Oligonucleotide Expression Arrays. Johns Hopkins University, Dept of Biostatistics Working
Papers Working Paper 1. 2004.
[72] Wu Z, Irizarry RA. Stochastic models inspired by hybridization theory for short oligonucleotide
arrays. J Comput Biol. 2005;12(6):882–893.
[73] Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, et al. Exploration,
normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics.
2003 Apr;4(2):249–264.
[74] Naef F, Magnasco MO. Solving the riddle of the bright mismatches: Labeling and effective binding
in oligonucleotide arrays. Phys Rev E. 2003 Jul;68:011906.
[75] Davis S, Meltzer PS. GEOquery: a bridge between the Gene Expression Omnibus (GEO) and
BioConductor. Bioinformatics. 2007 Jul;23(14):1846–1847.
[76] Smyth GK, Ritchie M, Thorne N, Wettenhall J, Shi W. limma: Linear Models for Microarray
Data User Guide; 2010.
[77] Brazma A, Parkinson H, Sarkans U, Shojatalab M, Vilo J, Abeygunawardena N, et al.
ArrayExpress–a public repository for microarray gene expression data at the EBI. Nucleic Acids
Res. 2003 Jan;31(1):68–71.
[78] Parkinson H, Sarkans U, Shojatalab M, Abeygunawardena N, Contrino S, Coulson R, et al.
ArrayExpress–a public repository for microarray gene expression data at the EBI. Nucleic Acids
Res. 2005 Jan;33(Database issue):D553–D555.
[79] Degtyarenko K, de Matos P, Ennis M, Hastings J, Zbinden M, McNaught A, et al. ChEBI:
a database and ontology for chemical entities of biological interest. Nucleic Acids Res. 2008
Jan;36(Database issue):D344–D350.
[80] Matos P, Alcántara R, Dekker A, Ennis M, Hastings J, Haug K, et al. Chemical Entities of
Biological Interest: an update. Nucleic Acids Res. 2010 Jan;38(Database issue):D249–D254.
[81] Edgar R, Domrachev M, Lash AE. Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. 2002 Jan;30(1):207–210.
[82] Barrett T, Troup DB, Wilhite SE, Ledoux P, Rudnev D, Evangelista C, et al. NCBI GEO:
archive for high-throughput functional genomic data. Nucleic Acids Res. 2009 Jan;37(Database
issue):D885–D890.
[83] Barrett T, Troup DB, Wilhite SE, Ledoux P, Evangelista C, Kim IF, et al. NCBI GEO: archive for
functional genomics data sets–10 years on. Nucleic Acids Res. 2011 Jan;39(Database issue):D1005–
D1010.
[84] Wishart DS, Tzur D, Knox C, Eisner R, Guo AC, Young N, et al. HMDB: the Human Metabolome
Database. Nucleic Acids Res. 2007 Jan;35(Database issue):D521–D526.
[85] Wishart DS, Knox C, Guo AC, Eisner R, Young N, Gautam B, et al. HMDB: a knowledgebase
for the human metabolome. Nucleic Acids Res. 2009 Jan;37(Database issue):D603–D610.
[86] Kanehisa M, Goto S, Kawashima S, Nakaya A. The KEGG databases at GenomeNet. Nucleic
Acids Res. 2002 Jan;30(1):42–46.
[87] Kanehisa M, Goto S, Sato Y, Furumichi M, Tanabe M. KEGG for integration and interpretation
of large-scale molecular data sets. Nucleic Acids Res. 2012 Jan;40(Database issue):D109–D114.
[88] Kanehisa M, Goto S, Hattori M, Aoki-Kinoshita KF, Itoh M, Kawashima S, et al. From genomics
to chemical genomics: new developments in KEGG. Nucleic Acids Res. 2006 Jan;34(Database
issue):D354–D357.
[89] Kanehisa M. Representation and analysis of molecular networks involving diseases and drugs.
Genome Inform. 2009 Oct;23(1):212–213.
[90] Pihlajamäki J, Boes T, Kim EY, Dearie F, Kim BW, Schroeder J, et al. Thyroid hormone-related
regulation of gene expression in human fatty liver. J Clin Endocrinol Metab. 2009 Sep;94(9):3521–
3529.
[91] Leskinen T, Rinnankoski-Tuikka R, Rintala M, Seppänen-Laakso T, Pöllänen E, Alen M, et al.
Differences in muscle and adipose tissue gene expression and cardio-metabolic risk factors in the
members of physical activity discordant twin pairs. PLoS One. 2010;5(9).
[92] Lê KA, Mahurkar S, Alderete TL, Hasson RE, Adam TC, Kim JS, et al. Subcutaneous adipose
tissue macrophage infiltration is associated with hepatic and visceral fat deposition, hyperinsulinemia, and stimulation of NF-kappaB stress pathway. Diabetes. 2011 Nov;60(11):2802–2809.
[93] Mutch DM, Pers TH, Temanni MR, Pelloux V, Marquez-Quiñones A, Holst C, et al. A distinct
adipose tissue gene expression response to caloric restriction predicts 6-mo weight maintenance in
obese subjects. Am J Clin Nutr. 2011 Dec;94(6):1399–1409.
[94] Mazzatti D, Lim FL, O’Hara A, Wood IS, Trayhurn P. A microarray analysis of the hypoxia-induced modulation of gene expression in human adipocytes. Arch Physiol Biochem. 2012 Jul;118(3):112–120.
[95] Johansson LE, Danielsson AP, Parikh H, Klintenberg M, Norström F, Groop L, et al. Differential gene expression in adipose tissue from obese human subjects during weight loss and weight
maintenance. Am J Clin Nutr. 2012 Jul;96(1):196–207.
[96] Cline MS, Smoot M, Cerami E, Kuchinsky A, Landys N, Workman C, et al. Integration of biological
networks and gene expression data using Cytoscape. Nat Protoc. 2007;2(10):2366–2382.
[97] Mardinoglu A, Nielsen J. Systems medicine and metabolic modelling. J Intern Med. 2012 Feb;271(2):142–154.
Appendix A
Results of all selected datasets
The appendix contains the results of the thesis (as presented in chapters 4.3 and 4.4) for each of the
eight selected datasets, which are described in chapter 4.3.1. It is structured according to
the eight datasets, and each subchapter includes the following results:
The first part presents the results of the differential expression analysis using the limma package in R.
A table shows the top 10 differentially expressed genes, the pathways in which they are involved, and
those top 10 reporter metabolites that are part of these pathways.
The second and third parts compare the results of the reporter metabolites
analysis: (i) using different models with the same expression data and (ii) using two kinds of expression
data from one dataset with one model. For this purpose, the reporter features algorithm was applied to each
set of gene expression data in combination with each of the three genome-scale metabolic models:
adipocyte, EHMN and Recon 1.
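The scoring idea behind the reporter metabolites analysis can be illustrated with a short sketch. It follows the general scheme described in [52] (gene p-values are converted to Z-scores, aggregated over the genes neighbouring a metabolite, and corrected against a background of random gene sets of the same size). It is not the implementation used in this thesis: the function names, the toy p-values, the number of random samples and the clipping constant are assumptions made for illustration only.

```python
import math
import random
from statistics import NormalDist, mean, stdev

_STD_NORMAL = NormalDist()  # standard normal distribution


def gene_z_scores(p_values, eps=1e-12):
    """Convert gene p-values to Z-scores with the inverse normal CDF.

    p-values are clipped to (eps, 1 - eps) so that inv_cdf stays defined
    (the clipping constant is an assumption of this sketch).
    """
    return {
        gene: _STD_NORMAL.inv_cdf(1.0 - min(max(p, eps), 1.0 - eps))
        for gene, p in p_values.items()
    }


def aggregate_z(z_scores, genes):
    """Aggregated Z-score of a gene set: sum(Z_i) / sqrt(k)."""
    return sum(z_scores[g] for g in genes) / math.sqrt(len(genes))


def reporter_scores(p_values, neighbours, n_random=1000, seed=0):
    """Return {metabolite: (corrected Z-score, p-value)}.

    neighbours maps each metabolite to the genes whose enzymes catalyse
    reactions involving that metabolite (hypothetical input format).
    """
    rng = random.Random(seed)
    z = gene_z_scores(p_values)
    all_genes = sorted(z)
    background = {}  # background mean/stdev, cached per gene-set size k
    results = {}
    for metabolite, genes in neighbours.items():
        genes = [g for g in genes if g in z]
        if not genes:
            continue
        k = len(genes)
        if k not in background:
            samples = [
                aggregate_z(z, rng.sample(all_genes, k)) for _ in range(n_random)
            ]
            background[k] = (mean(samples), stdev(samples))
        mu, sigma = background[k]
        if sigma == 0:  # degenerate background, e.g. k equals the number of genes
            continue
        corrected = (aggregate_z(z, genes) - mu) / sigma
        results[metabolite] = (corrected, 1.0 - _STD_NORMAL.cdf(corrected))
    return results


if __name__ == "__main__":
    # Toy data, not taken from any dataset of the thesis.
    toy_p = {"g1": 0.001, "g2": 0.002, "g3": 0.5, "g4": 0.6,
             "g5": 0.7, "g6": 0.8, "g7": 0.9, "g8": 0.4}
    toy_neighbours = {"M_low": ["g1", "g2"], "M_high": ["g6", "g7"]}
    for met, (score, p) in reporter_scores(toy_p, toy_neighbours).items():
        print(met, round(score, 3), p)
```

Metabolites whose neighbouring genes carry small p-values receive a high corrected score and therefore a small reporter p-value, which is the quantity ranked in the tables of this appendix.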
A.1 Gene expression in adipose tissue during weight loss (GSE11975)
Regulation of adipose tissue gene expression during different phases of a dietary weight loss program
and its relationship with insulin sensitivity [14]:
- Energy restriction phase (ER) with a 4-week very-low-calorie diet
- Weight stabilization period (WS) composed of a 2-month low-calorie diet
- 3 to 4 months of a weight maintenance (DI) diet
- Two samples per dietary phase, one before and one after the specific phase
The following comparisons were applied for the calculation of the differentially expressed genes:
(i) before vs. after energy restriction (ER), (ii) after energy restriction vs. after weight stabilization
(WS), and (iii) before dietary intervention vs. after weight stabilization (DI).
A.1.1 Differentially expressed genes
The following tables show the top 10 differentially expressed genes, the pathways they are involved in,
and those top 10 reporter metabolites, which are also involved in these pathways.
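How such a table can be assembled is sketched below: for every pathway of a gene, the compounds of the pathway are intersected with the top reporter metabolites of each model. This is only an illustration under stated assumptions. The function name and the input dictionaries are hypothetical, and the compound set given for the pathway is a small illustrative subset, not the full KEGG annotation.

```python
def annotate_genes(gene_pathways, pathway_compounds, top_metabolites):
    """List, per gene and pathway, the top reporter metabolites found in that pathway.

    gene_pathways:     {gene id: [pathway id, ...]}
    pathway_compounds: {pathway id: set of KEGG compound ids}
    top_metabolites:   {model name: set of KEGG compound ids of the top reporter metabolites}
    """
    rows = []
    for gene, pathways in gene_pathways.items():
        for pathway in pathways:
            compounds = pathway_compounds.get(pathway, set())
            hits = {
                model: sorted(compounds & metabolites)
                for model, metabolites in top_metabolites.items()
            }
            rows.append((gene, pathway, hits))
    return rows


if __name__ == "__main__":
    # Illustrative inputs only (hypothetical subset of pathway compounds).
    rows = annotate_genes(
        {"230": ["hsa00010"]},
        {"hsa00010": {"C00118", "C00236", "C00022"}},
        {"Adipocyte": {"C00118", "C00236", "C00149"}, "EHMN": {"C00577"}},
    )
    for row in rows:
        print(row)
```

A model without any top reporter metabolite in a pathway yields an empty list, which corresponds to an empty cell in the tables.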
Before vs. after energy restriction (ER)
| Rank | EntrezID | RefSeqID | GeneName | P-value | Pathway | Adipocyte | EHMN | Recon1 |
| 1 | 2331 | NM_002023 | fibromodulin | 2.89100446064137e-10 | | | | |
| 2 | 230 | NM_005165 | aldolase C, fructose-bisphosphate | 1.27029911564754e-08 | hsa00010: Glycolysis/Gluconeogenesis | C00118, C00236 | | |
| | | | | | hsa00030: Pentose phosphate pathway | C00118 | C00577 | |
| | | | | | hsa00051: Fructose and mannose metabolism | C00118 | C00577 | |
| | | | | | hsa01100: Metabolic pathways | C00026, C00118, C00122, C00149, C00236, C05272 | C00010, C00122, C00149, C00311, C00577, C01944, C05271, C05276 | C00003, C00004, C00122, C00149, C00266, C00422 |
| 3 | 85329 | NM_033101 | lectin, galactoside-binding, soluble, 12 | 1.56113172754231e-08 | | | | |
| 4 | 26292 | NM_012333 | c-myc binding protein | 7.31809251898118e-08 | | | | |
| 5 | 4311 | NM_007289 | membrane metallo-endopeptidase | 7.68345348715221e-08 | hsa04614: Renin-angiotensin system | | | |
| | | | | | hsa04640: Hematopoietic cell lineage | | | |
| | | | | | hsa04974: Protein digestion and absorption | | | |
| | | | | | hsa05010: Alzheimer’s disease | | | |
| 6 | 5918 | NM_002888 | retinoic acid receptor responder (tazarotene induced) 1 | 8.69645640007105e-08 | | | | |
| 7 | 4015 | NM_002317 | lysyl oxidase | 1.01198150445751e-07 | | | | |
| 8 | 8991 | NM_003944 | selenium binding protein 1 | 1.20641993480657e-07 | | | | |
| 9 | 200942 | NM_173546 | kelch domain containing 8B | 1.59842794800541e-07 | | | | |
| 10 | 6678 | NM_003118 | secreted protein, acidic, cysteine-rich (osteonectin) | 1.72515267323231e-07 | | | | |
Table A.1: The top 10 differentially expressed genes from the comparison ’before vs. after energy restriction’ with the corresponding pathways and reporter metabolites.
After energy restriction vs. after weight stabilization (WS)
| Rank | EntrezID | RefSeqID | GeneName | P-value | Pathway | Adipocyte | EHMN | Recon1 |
| 1 | 5918 | NM_002888 | retinoic acid receptor responder (tazarotene induced) 1 | 1.96312275773292e-12 | | | | |
| 2 | 2512 | NM_000146 | ferritin, light polypeptide | 2.58942973403245e-11 | hsa00860: Porphyrin and chlorophyll metabolism | | | |
| | | | | | hsa04978: Mineral absorption | | C00124 | |
| 3 | 2495 | NM_002032 | ferritin, heavy polypeptide 1 | 5.77157843379722e-10 | hsa00860: Porphyrin and chlorophyll metabolism | | | |
| | | | | | hsa04978: Mineral absorption | | C00124 | |
| 4 | 2331 | NM_002023 | fibromodulin | 8.83712364280302e-10 | | | | |
| 5 | 7018 | NM_001063 | transferrin | 1.22261082954918e-09 | hsa04978: Mineral absorption | | C00124 | |
| 6 | 53940 | NM_031894 | ferritin, heavy polypeptide-like 17 | 2.89825261692573e-09 | | | | |
| 7 | 85329 | NM_033101 | lectin, galactoside-binding, soluble, 12 | 3.40382326546594e-09 | | | | |
| 8 | 6720 | NM_001005291 | sterol regulatory element binding transcription factor 1 | 5.69684281972593e-09 | hsa04910: Insulin signaling pathway | | | |
| 9 | 8991 | NM_003944 | selenium binding protein 1 | 5.72487175899984e-09 | | | | |
| 10 | 3693 | NM_002213 | integrin, beta 5 | 5.86078392028941e-09 | hsa04145: Phagosome | | | |
| | | | | | hsa04510: Focal adhesion | | | |
| | | | | | hsa04512: ECM-receptor interaction | | | |
| | | | | | hsa04810: Regulation of actin cytoskeleton | | | |
| | | | | | hsa05410: Hypertrophic cardiomyopathy (HCM) | | | |
| | | | | | hsa05412: Arrhythmogenic right ventricular cardiomyopathy (ARVC) | | | |
| | | | | | hsa05414: Dilated cardiomyopathy | | | |
Table A.2: The top 10 differentially expressed genes from the comparison ’after energy restriction vs. after weight stabilization’ with the corresponding pathways and reporter metabolites.
Before dietary intervention vs. after weight stabilization (DI)
| Rank | EntrezID | RefSeqID | GeneName | P-value | Pathway | Reporter metabolites (Adipocyte, EHMN or Recon1) |
| 1 | 54968 | NM_001040613 | transmembrane protein 70 | 2.51811309478267e-08 | | |
| 2 | 950 | NM_005506 | scavenger receptor class B, member 2 | 6.8712935066448e-08 | hsa04142: Lysosome | |
| 3 | 2512 | NM_000146 | ferritin, light polypeptide | 7.50862402819666e-08 | hsa00860: Porphyrin and chlorophyll metabolism | C00430 |
| | | | | | hsa04978: Mineral absorption | C00080 |
| 4 | 2495 | NM_002032 | ferritin, heavy polypeptide 1 | 9.93681572742043e-08 | hsa00860: Porphyrin and chlorophyll metabolism | C00430 |
| | | | | | hsa04978: Mineral absorption | C00080 |
| 5 | 4794 | NM_004556 | nuclear factor of kappa light polypeptide gene enhancer in B-cells inhibitor, epsilon | 1.36610708335739e-07 | hsa04660: T cell receptor signaling pathway | |
| | | | | | hsa04662: B cell receptor signaling pathway | |
| | | | | | hsa04722: Neurotrophin signaling pathway | |
| | | | | | hsa04920: Adipocytokine signaling pathway | C00083, C00162 |
| | | | | | hsa05169: Epstein-Barr virus infection | |
| 6 | 5476 | NM_000308 | cathepsin A | 3.40254011668576e-07 | hsa04142: Lysosome | |
| | | | | | hsa04614: Renin-angiotensin system | |
| 7 | 11346 | NM_007286 | synaptopodin | 5.36612405808613e-07 | | |
| 8 | 10742 | NM_021785 | retinoic acid induced 2 | 6.47596378555396e-07 | | |
| 9 | 3200 | NM_153631 | homeobox A3 | 6.813484e-07 | | |
| 10 | 23363 | NM_001173431 | obscurin-like 1 | 7.315764e-07 | | |
Table A.3: The top 10 differentially expressed genes from the comparison ’before dietary intervention vs. after weight stabilization’ with the corresponding pathways and reporter metabolites.
A.1.2 Comparison between the models
The following tables show the top 10 reporter metabolites of one model in comparison to the rank of
these metabolites using the other two models and the same expression data.
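The construction of such a comparison table can be sketched as follows. The sketch assumes that each model yields a mapping from metabolite to reporter p-value and that tied p-values share one rank (dense ranking), as the tied entries in the tables suggest. The function names and the toy values are assumptions for illustration and do not reproduce the code used in this thesis.

```python
def dense_ranks(p_values):
    """Rank metabolites by ascending p-value; equal p-values share a rank."""
    ranks = {}
    rank = 0
    previous = None
    for metabolite, p in sorted(p_values.items(), key=lambda item: item[1]):
        if p != previous:
            rank += 1
            previous = p
        ranks[metabolite] = rank
    return ranks


def compare_models(reference, others, top_n=10):
    """Look up the top metabolites of the reference model in the other models.

    reference: {metabolite: p-value} of the model that defines the top list
    others:    {model name: {metabolite: p-value}}
    Metabolites missing from another model are reported as ("NA", "NA").
    """
    ref_ranks = dense_ranks(reference)
    other_ranks = {name: dense_ranks(pv) for name, pv in others.items()}
    rows = []
    for metabolite in sorted(reference, key=lambda m: (ref_ranks[m], m)):
        if ref_ranks[metabolite] > top_n:
            break
        row = {
            "metabolite": metabolite,
            "rank": ref_ranks[metabolite],
            "p": reference[metabolite],
        }
        for name, pv in others.items():
            if metabolite in pv:
                row[name] = (other_ranks[name][metabolite], pv[metabolite])
            else:
                row[name] = ("NA", "NA")
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Toy values, not taken from the tables of this appendix.
    adipocyte = {"C00149": 0.0027, "C00026": 0.0028, "X1": 0.008,
                 "X2": 0.008, "C00122": 0.0096}
    ehmn = {"C00149": 0.00008, "C00122": 0.29}
    for row in compare_models(adipocyte, {"EHMN": ehmn}, top_n=3):
        print(row)
```

Because tied metabolites share a rank, a "top 10" list can contain more than ten rows, as is the case in several of the following tables.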
Before vs. after energy restriction (ER)
| KEGG ID | Metabolite name | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value | Recon 1 rank | Recon 1 P-value |
| | eicosadienoyl-CoA (C20:2CoA, n-6) | 1 | 0.000815611 | NA | NA | NA | NA |
| | octadecadienoyl-CoA (C18:2CoA, n-6) | 2 | 0.00178503 | NA | NA | NA | NA |
| C00149 | L-Malate | 3 | 0.00269335 | 6 | 0.0000793772 | 53 | 0.0240723 |
| C00026 | 2-Oxoglutarate | 4 | 0.00282039 | 1069 | 0.520135 | 443 | 0.382618 |
| C00236 | 3-Phospho-D-glyceroyl phosphate | 5 | 0.0034244 | 1736 | 0.810936 | 193 | 0.139036 |
| | 1-Acyl-sn-glycerol 3-phosphate, adipocyte | 6 | 0.00742544 | NA | NA | NA | NA |
| | octadecatrienoyl-CoA (C18:3CoA, n-6) | 7 | 0.00803552 | NA | NA | NA | NA |
| | stearidonyl coenzyme A (C18:4CoA, n-3) | 7 | 0.00803552 | NA | NA | NA | NA |
| C00118 | Glyceraldehyde 3-phosphate | 8 | 0.00938127 | 1084 | 0.529239 | 16 | 0.00349265 |
| | docosenoyl-CoA (C22:1CoA, n-9) | 9 | 0.00959812 | NA | NA | NA | NA |
| C05272 | hexadecenoyl-CoA (C16:1CoA, n-9) | 9 | 0.00959812 | 515 | 0.243862 | 1010 | 0.820441 |
| C00510 | octadecenoyl-CoA (C18:1CoA, n-7) | 9 | 0.00959812 | 56 | 0.00651582 | 6 | 0.000563763 |
| C00122 | Fumarate | 10 | 0.00963716 | 592 | 0.291419 | 2 | 0.000196622 |
Table A.4: The top 10 reporter metabolites of the comparison ’before vs. after energy restriction’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
| KEGG ID | Metabolite name | EHMN rank | EHMN P-value | Adipocyte rank | Adipocyte P-value | Recon 1 rank | Recon 1 P-value |
| C01944 | Octanoyl-CoA | 1 | 0.0000464231 | 48 | 0.111422 | 52 | 0.0240429 |
| C02249 | Arachidonyl-CoA | 2 | 0.00005543 | NA | NA | 726 | 0.640881 |
| C00577 | D-Glyceraldehyde | 3 | 0.0000747738 | NA | NA | 99 | 0.0592681 |
| C05271 | trans-Hex-2-enoyl-CoA | 4 | 0.000075209 | NA | NA | NA | NA |
| C05276 | trans-Oct-2-enoyl-CoA | 4 | 0.000075209 | NA | NA | NA | NA |
| C00010 | CoA | 5 | 0.000079096 | 12 | 0.0142852 | 431 | 0.362262 |
| C00149 | (S)-Malate | 6 | 0.0000793772 | 32 | 0.0647696 | 53 | 0.0240723 |
| C00122 | Fumarate | 7 | 0.000255348 | 10 | 0.00963716 | 2 | 0.000196622 |
| CE5312 | 6(R)-hydroxy-tetradeca-2E,8Z-dienoate | 8 | 0.000293455 | NA | NA | NA | NA |
| CE5324 | 6(S)-hydroxy-tetradeca-2E,8Z-dienoate | 8 | 0.000293455 | NA | NA | NA | NA |
| CE5315 | 8(R)-hydroxy-hexadeca-2E,6E,10Z-trienoate | 8 | 0.000293455 | NA | NA | NA | NA |
| CE5327 | 8(S)-hydroxy-hexadeca-2E,6E,10Z-trienoate | 8 | 0.000293455 | NA | NA | NA | NA |
| CE0852 | palmitoleoyl-CoA | 9 | 0.000319094 | NA | NA | NA | NA |
| C00311 | Isocitrate | 10 | 0.00035769 | 19 | 0.0258153 | 39 | 0.017695 |
Table A.5: The top 10 reporter metabolites of the comparison ’before vs. after energy restriction’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
| KEGG ID | Metabolite name | Recon 1 rank | Recon 1 P-value | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value |
| C00004 | Nicotinamide adenine dinucleotide - reduced | 1 | 0.000188955 | 126 | 0.420891 | 99 | 0.0164511 |
| C00122 | Fumarate | 2 | 0.000196622 | 10 | 0.00963716 | 592 | 0.291419 |
| C00003 | Nicotinamide adenine dinucleotide | 3 | 0.000201739 | 126 | 0.420891 | 140 | 0.0324125 |
| C00149 | L-Malate | 4 | 0.000416562 | 32 | 0.0647696 | 6 | 0.0000793772 |
| | tetracosahexaenoyl coenzyme A | 5 | 0.000425598 | NA | NA | NA | NA |
| C00510 | Octadecenoyl-CoA (n-C18:1CoA) | 6 | 0.000563763 | 63 | 0.194446 | 56 | 0.00651582 |
| C16218 | trans-Octadec-2-enoyl-CoA | 6 | 0.000563763 | NA | NA | NA | NA |
| | vaccenyl coenzyme A | 6 | 0.000563763 | NA | NA | NA | NA |
| C00422 | triacylglycerol (homo sapiens) | 7 | 0.000585207 | NA | NA | 1159 | 0.565755 |
| | R total Coenzyme A | 8 | 0.00117564 | NA | NA | NA | NA |
| C01181 | 4-Trimethylammoniobutanoate | 9 | 0.00138316 | NA | NA | 757 | 0.37199 |
| C00266 | Glycolaldehyde | 10 | 0.00142146 | NA | NA | 926 | 0.462381 |
Table A.6: The top 10 reporter metabolites of the comparison ’before vs. after energy restriction’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
After energy restriction vs. after weight stabilization (WS)
| KEGG ID | Metabolite name | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value | Recon 1 rank | Recon 1 P-value |
| C00010 | Coenzyme A | 1 | 0.00043073 | 634 | 0.289467 | 418 | 0.355848 |
| | eicosadienoyl-CoA (C20:2CoA, n-6) | 2 | 0.000529598 | NA | NA | NA | NA |
| C00083 | Malonyl-CoA | 3 | 0.000983564 | 199 | 0.0494886 | 541 | 0.466681 |
| C00100 | Propanoyl-CoA (C3:0CoA) | 4 | 0.00127582 | 7 | 0.000292561 | 1145 | 0.924431 |
| | octadecadienoyl-CoA (C18:2CoA, n-6) | 5 | 0.00257569 | NA | NA | NA | NA |
| | average fatty-acyl CoA, human adipocyte | 6 | 0.00324056 | NA | NA | NA | NA |
| | 1-Acyl-sn-glycerol 3-phosphate, adipocyte | 7 | 0.00343025 | NA | NA | NA | NA |
| C00033 | Acetate | 8 | 0.00537747 | 616 | 0.27988 | 199 | 0.145767 |
| | docosenoyl-CoA (C22:1CoA, n-9) | 9 | 0.00762508 | NA | NA | NA | NA |
| C05272 | hexadecenoyl-CoA (C16:1CoA, n-9) | 9 | 0.00762508 | 1014 | 0.493451 | 709 | 0.591968 |
| C00510 | octadecenoyl-CoA (C18:1CoA, n-7) | 9 | 0.00762508 | 460 | 0.187829 | 4 | 0.000833699 |
| C00163 | Propionate | 10 | 0.00966763 | 2066 | 0.965631 | 13 | 0.00543786 |
Table A.7: The top 10 reporter metabolites of the comparison ’after energy restriction vs. after weight stabilization’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
| KEGG ID | Metabolite name | EHMN rank | EHMN P-value | Adipocyte rank | Adipocyte P-value | Recon 1 rank | Recon 1 P-value |
| C00010 | CoA | 1 | 0.0000442993 | 1 | 0.00043073 | 418 | 0.355848 |
| C00116 | Glycerol | 2 | 0.0000739748 | 219 | 0.774998 | 972 | 0.792624 |
| C00022 | Pyruvate | 3 | 0.0000846602 | 40 | 0.0811011 | 987 | 0.806428 |
| C00124 | D-Galactose | 4 | 0.000107323 | NA | NA | 36 | 0.0144961 |
| C00001 | H2O | 5 | 0.000130733 | 198 | 0.716951 | 1115 | 0.902857 |
| CE2432 | trans-2-cis,cis-5,8-tetradecatrienoyl-CoA | 6 | 0.000231587 | NA | NA | NA | NA |
| C00100 | Propanoyl-CoA | 7 | 0.000292561 | 78 | 0.240532 | 1145 | 0.924431 |
| C00630 | 2-Methylpropanoyl-CoA | 8 | 0.000310565 | 42 | 0.0843622 | 281 | 0.225949 |
| C00149 | (S)-Malate | 9 | 0.000548462 | 25 | 0.0340194 | 33 | 0.0127139 |
| C00256 | (R)-Lactate | 10 | 0.000578654 | NA | NA | 547 | 0.471218 |
Table A.8: The top 10 reporter metabolites of the comparison ’after energy restriction vs. after weight stabilization’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
| KEGG ID | Metabolite name | Recon 1 rank | Recon 1 P-value | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value |
| C00422 | triacylglycerol (homo sapiens) | 1 | 0.000434381 | NA | NA | 446 | 0.179351 |
| C00665 | D-Fructose 2,6-bisphosphate | 2 | 0.000459701 | 60 | 0.138369 | 1657 | 0.774421 |
| C00412 | Stearoyl-CoA (n-C18:0CoA) | 3 | 0.000772334 | 106 | 0.370419 | 525 | 0.224664 |
| C00510 | Octadecenoyl-CoA (n-C18:1CoA) | 4 | 0.000833699 | 106 | 0.370419 | 460 | 0.187829 |
| C16218 | trans-Octadec-2-enoyl-CoA | 4 | 0.000833699 | NA | NA | NA | NA |
| | vaccenyl coenzyme A | 4 | 0.000833699 | NA | NA | NA | NA |
| | tetracosahexaenoyl coenzyme A | 5 | 0.000983564 | NA | NA | NA | NA |
| C00681 | lysophosphatidic acid (homo sapiens) | 6 | 0.00173502 | NA | NA | 158 | 0.0345668 |
| C00581 | Guanidinoacetate | 7 | 0.00185154 | NA | NA | 22 | 0.00237982 |
| C01149 | 4-Trimethylammoniobutanal | 8 | 0.00303461 | NA | NA | 584 | 0.261257 |
| C00671 | (S)-3-Methyl-2-oxopentanoate | 9 | 0.00357221 | 34 | 0.0714451 | 1038 | 0.501129 |
| C00141 | 3-Methyl-2-oxobutanoate | 9 | 0.00357221 | 34 | 0.0714451 | 30 | 0.00348937 |
| C00233 | 4-Methyl-2-oxopentanoate | 9 | 0.00357221 | 34 | 0.0714451 | 30 | 0.00348937 |
| C00122 | Fumarate | 10 | 0.00409946 | 16 | 0.0223838 | 345 | 0.126616 |
Table A.9: The top 10 reporter metabolites of the comparison ’after energy restriction vs. after weight stabilization’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
APPENDIX A. RESULTS OF ALL SELECTED DATASETS
Before dietary intervention vs. after weight stabilization (DI)
| KEGG ID | Metabolite name | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| C00236 | 3-Phospho-D-glyceroyl phosphate | 1 | 0.000241325 | 721 | 0.361765 | 149 | 0.0980095 |
| C00365 | dUMP | 2 | 0.000882425 | 156 | 0.0520995 | 14 | 0.0100781 |
| C00631 | D-Glycerate 2-phosphate | 3 | 0.00169653 | 2068 | 0.966784 | 254 | 0.185858 |
| C00033 | Acetate | 4 | 0.00896914 | 2009 | 0.943246 | 273 | 0.206262 |
| C00186 | L-Lactate | 4 | 0.00896914 | 1610 | 0.777097 | 1012 | 0.843079 |
| C00364 | dTMP | 5 | 0.00989853 | 1278 | 0.640701 | 24 | 0.0153494 |
| C01342 | Ammonium | 6 | 0.0155426 | 1423 | 0.693694 | 635 | 0.52276 |
| C11455 | 4,4-dimethylcholesta-8,14,24-trienol | 7 | 0.0167523 | 191 | 0.068952 | 32 | 0.0223548 |
| C00080 | H+ | 8 | 0.0304529 | 1900 | 0.906596 | 667 | 0.555862 |
| C00197 | 3-Phospho-D-glycerate | 9 | 0.0427859 | 326 | 0.146362 | 973 | 0.803022 |
| C00021 | S-Adenosyl-L-homocysteine | 10 | 0.0432681 | 671 | 0.340687 | 436 | 0.346229 |
Table A.10: The top 10 reporter metabolites of the comparison ’before dietary intervention vs. after
weight stabilization’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
| KEGG ID | Metabolite name | EHMN rank | EHMN P-value | Adipocyte rank | Adipocyte P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| CE1918 | 5-hydroxytryptophol | 1 | 0.000124565 | NA | NA | NA | NA |
| CE2252 | 3-oxooctadecanoyl-CoA | 2 | 0.000475843 | NA | NA | NA | NA |
| C00010 | CoA | 2 | 0.000475843 | 97 | 0.371356 | 255 | 0.185966 |
| C00083 | Malonyl-CoA | 2 | 0.000475843 | 126 | 0.444287 | 344 | 0.269801 |
| C00154 | Palmitoyl-CoA | 2 | 0.000475843 | 151 | 0.543014 | 407 | 0.323938 |
| C01003 | Myosin light chain | 3 | 0.000720087 | NA | NA | NA | NA |
| C03875 | Myosin light chain phosphate | 3 | 0.000720087 | NA | NA | NA | NA |
| C00249 | Hexadecanoic acid | 4 | 0.000742868 | 111 | 0.399068 | 288 | 0.216668 |
| C06412 | Palmitoyl-protein | 4 | 0.000742868 | NA | NA | NA | NA |
| C00001 | H2O | 5 | 0.00112052 | 244 | 0.860932 | 686 | 0.570013 |
| C05889 | Undecaprenyl-diphospho-N-acetylmuramoyl-(N-acetylglucosamine)-L | 6 | 0.00119732 | NA | NA | NA | NA |
| C05890 | Undecaprenyl-diphospho-N-acetylmuramoyl-(N-acetylglucosamine)-L | 6 | 0.00119732 | NA | NA | NA | NA |
| C05893 | Undecaprenyl-diphospho-N-acetylmuramoyl-(N-acetylglucosamine)-L | 6 | 0.00119732 | NA | NA | NA | NA |
| C05894 | Undecaprenyl-diphospho-N-acetylmuramoyl-(N-acetylglucosamine)-L | 6 | 0.00119732 | NA | NA | NA | NA |
| C05898 | Undecaprenyl-diphospho-N-acetylmuramoyl-(N-acetylglucosamine)-L | 6 | 0.00119732 | NA | NA | NA | NA |
| C05899 | Undecaprenyl-diphospho-N-acetylmuramoyl-(N-acetylglucosamine)-L | 6 | 0.00119732 | NA | NA | NA | NA |
| C01290 | beta-D-Galactosyl-1,4-beta-D-glucosylceramide | 7 | 0.00129634 | NA | NA | 785 | 0.661467 |
| C01582 | Galactose | 7 | 0.00129634 | NA | NA | NA | NA |
| C13856 | 2-Arachidonoylglycerol | 8 | 0.00141498 | NA | NA | NA | NA |
| C00162 | Fatty acid | 9 | 0.00202817 | NA | NA | NA | NA |
| CE3481 | 1-lyso-2-arachidonoyl-phosphatidate | 10 | 0.00226329 | NA | NA | NA | NA |
| C02960 | Ceramide 1-phosphate | 10 | 0.00226329 | NA | NA | 718 | 0.596105 |
| C00836 | Sphinganine | 10 | 0.00226329 | 175 | 0.6127 | 282 | 0.214305 |
| C00319 | Sphingosine | 10 | 0.00226329 | NA | NA | 756 | 0.637543 |
Table A.11: The top 10 reporter metabolites of the comparison ’before dietary intervention vs. after
weight stabilization’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
| KEGG ID | Metabolite name | Recon 1 rank | Recon 1 P-value | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value |
|---|---|---|---|---|---|---|---|
| C00214 | Thymidine | 1 | 0.00121664 | 21 | 0.076193 | 560 | 0.279837 |
| C00430 | 5-Amino-4-oxopentanoate | 2 | 0.00207101 | NA | NA | 12 | 0.00260582 |
| C00365 | dUMP | 3 | 0.00225587 | 2 | 0.000882425 | 156 | 0.0520995 |
| C00740 | D-Serine | 4 | 0.00282043 | NA | NA | 1092 | 0.548522 |
| C00671 | (S)-3-Methyl-2-oxopentanoate | 5 | 0.00311572 | 32 | 0.106318 | 227 | 0.087367 |
| C00141 | 3-Methyl-2-oxobutanoate | 5 | 0.00311572 | 32 | 0.106318 | 18 | 0.00382342 |
| C00233 | 4-Methyl-2-oxopentanoate | 5 | 0.00311572 | 32 | 0.106318 | 18 | 0.00382342 |
| C00153 | Nicotinamide | 6 | 0.00403591 | NA | NA | 972 | 0.483973 |
| C00108 | Anthranilate | 7 | 0.00444249 | NA | NA | 81 | 0.0224737 |
| C05653 | N-Formylanthranilate | 7 | 0.00444249 | NA | NA | 81 | 0.0224737 |
| C00427 | Prostaglandin H2 | 8 | 0.0049088 | NA | NA | 775 | 0.385616 |
| C00147 | Adenine | 9 | 0.00650209 | NA | NA | 36 | 0.00763469 |
| C00262 | Hypoxanthine | 9 | 0.00650209 | NA | NA | 36 | 0.00763469 |
| C00294 | Inosine | 9 | 0.00650209 | NA | NA | 36 | 0.00763469 |
| C00672 | 2-Deoxy-D-ribose 1-phosphate | 10 | 0.00752422 | NA | NA | 64 | 0.0158303 |
Table A.12: The top 10 reporter metabolites of the comparison ’before dietary intervention vs. after
weight stabilization’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
A.1.3 Comparison between expression data
The following tables show the comparison of the top 10 reporter metabolites using the different expression data of this dataset and the same model.
Adipocyte
| KEGG ID | Metabolite name | DI rank | DI P-value | ER rank | ER P-value | WS rank | WS P-value |
|---|---|---|---|---|---|---|---|
| C00236 | 3-Phospho-D-glyceroyl phosphate | 1 | 0.000241325 | 5 | 0.0034244 | 252 | 0.868703 |
| C00365 | dUMP | 2 | 0.000882425 | 140 | 0.484077 | 138 | 0.497858 |
| C00631 | D-Glycerate 2-phosphate | 3 | 0.00169653 | 218 | 0.797504 | 65 | 0.172218 |
| C00033 | Acetate | 4 | 0.00896914 | 84 | 0.276577 | 8 | 0.00537747 |
| C00186 | L-Lactate | 4 | 0.00896914 | 172 | 0.641447 | 44 | 0.0857472 |
| C00364 | dTMP | 5 | 0.00989853 | 245 | 0.864606 | 147 | 0.525764 |
| C01342 | Ammonium | 6 | 0.0155426 | 284 | 0.96675 | 52 | 0.110458 |
| C11455 | 4,4-dimethylcholesta-8,14,24-trienol | 7 | 0.0167523 | 184 | 0.688295 | 22 | 0.0284681 |
| C00080 | H+ | 8 | 0.0304529 | 274 | 0.928399 | 282 | 0.938686 |
| C00197 | 3-Phospho-D-glycerate | 9 | 0.0427859 | 130 | 0.438251 | 59 | 0.137453 |
| C00021 | S-Adenosyl-L-homocysteine | 10 | 0.0432681 | 161 | 0.582947 | 136 | 0.495402 |
Table A.13: The comparison of the top 10 reporter metabolites between ’before dietary intervention vs.
after weight stabilization (DI)’, ’before vs. after energy restriction (ER)’, and ’after energy restriction
vs. after weight stabilization (WS)’ based on the adipocyte model.
| KEGG ID | Metabolite name | ER rank | ER P-value | DI rank | DI P-value | WS rank | WS P-value |
|---|---|---|---|---|---|---|---|
| | eicosadienoyl-CoA (C20:2CoA, n-6) | 1 | 0.000815611 | 192 | 0.670054 | 2 | 0.000529598 |
| | octadecadienoyl-CoA (C18:2CoA, n-6) | 2 | 0.00178503 | 183 | 0.650181 | 5 | 0.00257569 |
| C00149 | L-Malate | 3 | 0.00269335 | 282 | 0.975743 | 25 | 0.0340194 |
| C00026 | 2-Oxoglutarate | 4 | 0.00282039 | 121 | 0.427831 | 188 | 0.673654 |
| C00236 | 3-Phospho-D-glyceroyl phosphate | 5 | 0.0034244 | 1 | 0.000241325 | 252 | 0.868703 |
| | 1-Acyl-sn-glycerol 3-phosphate, adipocyte | 6 | 0.00742544 | 112 | 0.399639 | 7 | 0.00343025 |
| | octadecatrienoyl-CoA (C18:3CoA, n-6) | 7 | 0.00803552 | 141 | 0.494175 | 23 | 0.0299012 |
| | stearidonyl coenzyme A (C18:4CoA, n-3) | 7 | 0.00803552 | 141 | 0.494175 | 23 | 0.0299012 |
| C00118 | Glyceraldehyde 3-phosphate | 8 | 0.00938127 | 12 | 0.0507808 | 169 | 0.595346 |
| | docosenoyl-CoA (C22:1CoA, n-9) | 9 | 0.00959812 | 170 | 0.605169 | 9 | 0.00762508 |
| C05272 | hexadecenoyl-CoA (C16:1CoA, n-9) | 9 | 0.00959812 | 170 | 0.605169 | 9 | 0.00762508 |
| C00510 | octadecenoyl-CoA (C18:1CoA, n-7) | 9 | 0.00959812 | 200 | 0.714021 | 106 | 0.370419 |
| C00122 | Fumarate | 10 | 0.00963716 | 261 | 0.908253 | 16 | 0.0223838 |
Table A.14: The comparison of the top 10 reporter metabolites between ’before vs. after energy
restriction (ER)’, ’before dietary intervention vs. after weight stabilization (DI)’, and ’after energy
restriction vs. after weight stabilization (WS)’ based on the adipocyte model.
| KEGG ID | Metabolite name | WS rank | WS P-value | DI rank | DI P-value | ER rank | ER P-value |
|---|---|---|---|---|---|---|---|
| C00010 | Coenzyme A | 1 | 0.00043073 | 97 | 0.371356 | 12 | 0.0142852 |
| | eicosadienoyl-CoA (C20:2CoA, n-6) | 2 | 0.000529598 | 192 | 0.670054 | 1 | 0.000815611 |
| C00083 | Malonyl-CoA | 3 | 0.000983564 | 126 | 0.444287 | 26 | 0.048913 |
| C00100 | Propanoyl-CoA (C3:0CoA) | 4 | 0.00127582 | 235 | 0.82635 | 91 | 0.303512 |
| | octadecadienoyl-CoA (C18:2CoA, n-6) | 5 | 0.00257569 | 183 | 0.650181 | 2 | 0.00178503 |
| | average fatty-acyl CoA, human adipocyte | 6 | 0.00324056 | 125 | 0.443888 | 15 | 0.0164075 |
| | 1-Acyl-sn-glycerol 3-phosphate, adipocyte | 7 | 0.00343025 | 112 | 0.399639 | 6 | 0.00742544 |
| C00033 | Acetate | 8 | 0.00537747 | 136 | 0.472094 | 84 | 0.276577 |
| | docosenoyl-CoA (C22:1CoA, n-9) | 9 | 0.00762508 | 170 | 0.605169 | 9 | 0.00959812 |
| C05272 | hexadecenoyl-CoA (C16:1CoA, n-9) | 9 | 0.00762508 | 170 | 0.605169 | 9 | 0.00959812 |
| C00510 | octadecenoyl-CoA (C18:1CoA, n-7) | 9 | 0.00762508 | 200 | 0.714021 | 63 | 0.194446 |
| C00163 | Propionate | 10 | 0.00966763 | 230 | 0.802412 | 109 | 0.384904 |
Table A.15: The comparison of the top 10 reporter metabolites between ’after energy restriction vs.
after weight stabilization (WS)’, ’before dietary intervention vs. after weight stabilization (DI)’, and
’before vs. after energy restriction (ER)’ based on the adipocyte model.
EHMN
| KEGG ID | Metabolite name | DI rank | DI P-value | ER rank | ER P-value | WS rank | WS P-value |
|---|---|---|---|---|---|---|---|
| CE1918 | 5-hydroxytryptophol | 1 | 0.000124565 | 1174 | 0.573729 | 88 | 0.0126586 |
| CE2252 | 3-oxooctadecanoyl-CoA | 2 | 0.000475843 | 251 | 0.0861474 | 257 | 0.0832488 |
| C00010 | CoA | 2 | 0.000475843 | 121 | 0.0241291 | 634 | 0.289467 |
| C00083 | Malonyl-CoA | 2 | 0.000475843 | 144 | 0.0333697 | 199 | 0.0494886 |
| C00154 | Palmitoyl-CoA | 2 | 0.000475843 | 29 | 0.00234422 | 102 | 0.015661 |
| C01003 | Myosin light chain | 3 | 0.000720087 | 1507 | 0.717333 | 1930 | 0.90093 |
| C03875 | Myosin light chain phosphate | 3 | 0.000720087 | 1507 | 0.717333 | 1930 | 0.90093 |
| C00249 | Hexadecanoic acid | 4 | 0.000742868 | 516 | 0.244235 | 183 | 0.0410749 |
| C06412 | Palmitoyl-protein | 4 | 0.000742868 | 1280 | 0.619202 | 16 | 0.00146671 |
| C00001 | H2O | 5 | 0.00112052 | 1991 | 0.928843 | 1215 | 0.580405 |
| C05889 | Undecaprenyl-diphospho-N-acetylmuramoyl-(N-acetylglucosamine)-L | 6 | 0.00119732 | 1129 | 0.552433 | 19 | 0.00180818 |
| C05890 | Undecaprenyl-diphospho-N-acetylmuramoyl-(N-acetylglucosamine)-L | 6 | 0.00119732 | 1129 | 0.552433 | 19 | 0.00180818 |
| C05893 | Undecaprenyl-diphospho-N-acetylmuramoyl-(N-acetylglucosamine)-L | 6 | 0.00119732 | 1129 | 0.552433 | 19 | 0.00180818 |
| C05894 | Undecaprenyl-diphospho-N-acetylmuramoyl-(N-acetylglucosamine)-L | 6 | 0.00119732 | 1129 | 0.552433 | 19 | 0.00180818 |
| C05898 | Undecaprenyl-diphospho-N-acetylmuramoyl-(N-acetylglucosamine)-L | 6 | 0.00119732 | 1129 | 0.552433 | 19 | 0.00180818 |
| C05899 | Undecaprenyl-diphospho-N-acetylmuramoyl-(N-acetylglucosamine)-L | 6 | 0.00119732 | 1129 | 0.552433 | 19 | 0.00180818 |
| C01290 | beta-D-Galactosyl-1,4-beta-D-glucosylceramide | 7 | 0.00129634 | 1460 | 0.697697 | 1240 | 0.591959 |
| C01582 | Galactose | 7 | 0.00129634 | 1473 | 0.702801 | 55 | 0.00724312 |
| C13856 | 2-Arachidonoylglycerol | 8 | 0.00141498 | 1644 | 0.779317 | 1482 | 0.696163 |
| C00162 | Fatty acid | 9 | 0.00202817 | 1947 | 0.910556 | 827 | 0.400344 |
| CE3481 | 1-lyso-2-arachidonoyl-phosphatidate | 10 | 0.00226329 | 599 | 0.293282 | 246 | 0.0780639 |
| C02960 | Ceramide 1-phosphate | 10 | 0.00226329 | 1644 | 0.779317 | 1482 | 0.696163 |
| C00836 | Sphinganine | 10 | 0.00226329 | 2096 | 0.989158 | 1908 | 0.886791 |
| C00319 | Sphingosine | 10 | 0.00226329 | 1741 | 0.813051 | 1793 | 0.830949 |
Table A.16: The comparison of the top 10 reporter metabolites between ’before dietary intervention vs.
after weight stabilization (DI)’, ’before vs. after energy restriction (ER)’, and ’after energy restriction
vs. after weight stabilization (WS)’ based on the EHMN model.
ER
KEGG ID
Metabolite name
DI
Rank
P-value
Rank
1569
WS
P-value
P-value
C01944
Octanoyl-CoA
1
0.0000464231
C02249
Arachidonyl-CoA
2
0.00005543
C00577
D-Glyceraldehyde
3
0.0000747738
C05271
trans-Hex-2-enoyl-CoA
4
0.000075209
1546
0.747177
69
0.0101871
C05276
trans-Oct-2-enoyl-CoA
4
0.000075209
1546
0.747177
69
0.0101871
C00010
CoA
5
0.000079096
962
0.478043
634
C00149
(S)-Malate
6
0.0000793772
2084
0.978406
9
C00122
Fumarate
7
0.000255348
1860
0.886873
345
CE5312
6(R)-hydroxy-tetradeca-2E,8Z-dienoate
8
0.000293455
1185
0.592985
38
0.00465786
CE5324
6(S)-hydroxy-tetradeca-2E,8Z-dienoate
8
0.000293455
1185
0.592985
38
0.00465786
CE5315
8(R)-hydroxy-hexadeca-2E,6E,10Z-trienoate
8
0.000293455
1185
0.592985
38
0.00465786
CE5327
8(S)-hydroxy-hexadeca-2E,6E,10Z-trienoate
8
0.000293455
1185
0.592985
38
0.00465786
CE0852
palmitoleoyl-CoA
9
0.000319094
91
C00311
Isocitrate
10
0.00035769
258
16
1591
0.75662
Rank
1144
0.548302
0.101004
789
0.379997
0.00316358
773
0.367531
0.289467
0.000548462
0.126616
0.0260006
387
0.141423
0.765921
144
0.0286262
Table A.17: The comparison of the top 10 reporter metabolites between ’before vs. after energy
restriction (ER)’, ’before dietary intervention vs. after weight stabilization (DI)’, and ’after energy
restriction vs. after weight stabilization (WS)’ based on the EHMN model.
| KEGG ID | Metabolite name | WS rank | WS P-value | DI rank | DI P-value | ER rank | ER P-value |
|---|---|---|---|---|---|---|---|
| C00010 | CoA | 1 | 0.0000442993 | 962 | 0.478043 | 121 | 0.0241291 |
| C00116 | Glycerol | 2 | 0.0000739748 | 120 | 0.0368215 | 84 | 0.013672 |
| C00022 | Pyruvate | 3 | 0.0000846602 | 1610 | 0.777097 | 1758 | 0.824276 |
| C00124 | D-Galactose | 4 | 0.000107323 | 859 | 0.422313 | 2018 | 0.941676 |
| C00001 | H2O | 5 | 0.000130733 | 663 | 0.338044 | 1991 | 0.928843 |
| CE2432 | trans-2-cis,cis-5,8-tetradecatrienoyl-CoA | 6 | 0.000231587 | 1945 | 0.92305 | 46 | 0.00477192 |
| C00100 | Propanoyl-CoA | 7 | 0.000292561 | 706 | 0.357157 | 17 | 0.000862806 |
| C00630 | 2-Methylpropanoyl-CoA | 8 | 0.000310565 | 893 | 0.443749 | 15 | 0.000663553 |
| C00149 | (S)-Malate | 9 | 0.000548462 | 2084 | 0.978406 | 6 | 0.0000793772 |
| C00256 | (R)-Lactate | 10 | 0.000578654 | 1610 | 0.777097 | 1758 | 0.824276 |
Table A.18: The comparison of the top 10 reporter metabolites between ’after energy restriction vs.
after weight stabilization (WS)’, ’before dietary intervention vs. after weight stabilization (DI)’, and
’before vs. after energy restriction (ER)’ based on the EHMN model.
Recon 1
| KEGG ID | Metabolite name | DI rank | DI P-value | ER rank | ER P-value | WS rank | WS P-value |
|---|---|---|---|---|---|---|---|
| C00214 | Thymidine | 1 | 0.00121664 | 1116 | 0.898702 | 99 | 0.0490703 |
| C00430 | 5-Amino-4-oxopentanoate | 2 | 0.00207101 | 68 | 0.0314408 | 124 | 0.0762054 |
| C00365 | dUMP | 3 | 0.00225587 | 361 | 0.300767 | 583 | 0.506843 |
| C00740 | D-Serine | 4 | 0.00282043 | 216 | 0.161373 | 158 | 0.105996 |
| C00671 | (S)-3-Methyl-2-oxopentanoate | 5 | 0.00311572 | 376 | 0.31599 | 345 | 0.287947 |
| C00141 | 3-Methyl-2-oxobutanoate | 5 | 0.00311572 | 376 | 0.31599 | 345 | 0.287947 |
| C00233 | 4-Methyl-2-oxopentanoate | 5 | 0.00311572 | 376 | 0.31599 | 345 | 0.287947 |
| C00153 | Nicotinamide | 6 | 0.00403591 | 379 | 0.319802 | 337 | 0.279185 |
| C00108 | Anthranilate | 7 | 0.00444249 | 935 | 0.782736 | 27 | 0.0105086 |
| C05653 | N-Formylanthranilate | 7 | 0.00444249 | 935 | 0.782736 | 27 | 0.0105086 |
| C00427 | Prostaglandin H2 | 8 | 0.0049088 | 113 | 0.0717896 | 130 | 0.0792916 |
| C00147 | Adenine | 9 | 0.00650209 | 320 | 0.262762 | 242 | 0.181477 |
| C00262 | Hypoxanthine | 9 | 0.00650209 | 419 | 0.350411 | 935 | 0.763497 |
| C00294 | Inosine | 9 | 0.00650209 | 524 | 0.456885 | 171 | 0.121001 |
| C00672 | 2-Deoxy-D-ribose 1-phosphate | 10 | 0.00752422 | 662 | 0.576983 | 685 | 0.578786 |
Table A.19: The comparison of the top 10 reporter metabolites between ’before dietary intervention vs.
after weight stabilization (DI)’, ’before vs. after energy restriction (ER)’, and ’after energy restriction
vs. after weight stabilization (WS)’ based on the Recon 1 model.
| KEGG ID | Metabolite name | ER rank | ER P-value | DI rank | DI P-value | WS rank | WS P-value |
|---|---|---|---|---|---|---|---|
| C00004 | Nicotinamide adenine dinucleotide - reduced | 1 | 0.000188955 | 558 | 0.461106 | 54 | 0.0193415 |
| C00122 | Fumarate | 2 | 0.000196622 | 1017 | 0.845368 | 10 | 0.00409946 |
| C00003 | Nicotinamide adenine dinucleotide | 3 | 0.000201739 | 469 | 0.371783 | 85 | 0.0400579 |
| C00149 | L-Malate | 4 | 0.000416562 | 1181 | 0.966363 | 33 | 0.0127139 |
| | tetracosahexaenoyl coenzyme A | 5 | 0.000425598 | 610 | 0.502014 | 5 | 0.000983564 |
| C00510 | Octadecenoyl-CoA (n-C18:1CoA) | 6 | 0.000563763 | 401 | 0.318402 | 4 | 0.000833699 |
| C16218 | trans-Octadec-2-enoyl-CoA | 6 | 0.000563763 | 1027 | 0.854178 | 244 | 0.186712 |
| | vaccenyl coenzyme A | 6 | 0.000563763 | 1027 | 0.854178 | 244 | 0.186712 |
| C00422 | triacylglycerol (homo sapiens) | 7 | 0.000585207 | 1093 | 0.902535 | 972 | 0.792624 |
| | R total Coenzyme A | 8 | 0.00117564 | 428 | 0.339875 | 43 | 0.0169446 |
| C01181 | 4-Trimethylammoniobutanoate | 9 | 0.00138316 | 1078 | 0.895337 | 60 | 0.0235543 |
| C00266 | Glycolaldehyde | 10 | 0.00142146 | 230 | 0.166069 | 269 | 0.207325 |
Table A.20: The comparison of the top 10 reporter metabolites between ’before vs. after energy
restriction (ER)’, ’before dietary intervention vs. after weight stabilization (DI)’, and ’after energy
restriction vs. after weight stabilization (WS)’ based on the Recon 1 model.
| KEGG ID | Metabolite name | WS rank | WS P-value | DI rank | DI P-value | ER rank | ER P-value |
|---|---|---|---|---|---|---|---|
| C00422 | triacylglycerol (homo sapiens) | 1 | 0.000434381 | 1093 | 0.902535 | 714 | 0.629066 |
| C00665 | D-Fructose 2,6-bisphosphate | 2 | 0.000459701 | 160 | 0.106753 | 122 | 0.0742671 |
| C00412 | Stearoyl-CoA (n-C18:0CoA) | 3 | 0.000772334 | 378 | 0.297258 | 20 | 0.00530395 |
| C00510 | Octadecenoyl-CoA (n-C18:1CoA) | 4 | 0.000833699 | 401 | 0.318402 | 6 | 0.000563763 |
| C16218 | trans-Octadec-2-enoyl-CoA | 4 | 0.000833699 | 1027 | 0.854178 | 41 | 0.0179762 |
| | vaccenyl coenzyme A | 4 | 0.000833699 | 1027 | 0.854178 | 41 | 0.0179762 |
| | tetracosahexaenoyl coenzyme A | 5 | 0.000983564 | 610 | 0.502014 | 5 | 0.000425598 |
| C00681 | lysophosphatidic acid (homo sapiens) | 6 | 0.00173502 | 534 | 0.433646 | 13 | 0.00260034 |
| C00581 | Guanidinoacetate | 7 | 0.00185154 | 51 | 0.0371007 | 139 | 0.0892602 |
| C01149 | 4-Trimethylammoniobutanal | 8 | 0.00303461 | 983 | 0.81522 | 15 | 0.00343611 |
| C00671 | (S)-3-Methyl-2-oxopentanoate | 9 | 0.00357221 | 209 | 0.150242 | 376 | 0.31599 |
| C00141 | 3-Methyl-2-oxobutanoate | 9 | 0.00357221 | 209 | 0.150242 | 376 | 0.31599 |
| C00233 | 4-Methyl-2-oxopentanoate | 9 | 0.00357221 | 209 | 0.150242 | 376 | 0.31599 |
| C00122 | Fumarate | 10 | 0.00409946 | 1017 | 0.845368 | 2 | 0.000196622 |
Table A.21: The comparison of the top 10 reporter metabolites between ’after energy restriction vs.
after weight stabilization (WS)’, ’before dietary intervention vs. after weight stabilization (DI)’, and
’before vs. after energy restriction (ER)’ based on the Recon 1 model.
A.2 Expression data from human adipose tissue (GSE15773)
This dataset was used to determine gene expression signatures of omental and subcutaneous adipose tissue samples [15]:
- 5 insulin-resistant subjects
- 5 insulin-sensitive subjects
- Insulin-resistant and insulin-sensitive subjects were matched by body mass index (BMI)
- One subcutaneous and one omental adipose tissue sample was taken from each subject
The differentially expressed genes were calculated for the following comparisons:
(i) insulin-resistant vs. insulin-sensitive omental tissue and (ii) insulin-resistant vs. insulin-sensitive
subcutaneous tissue.
A.2.1 Differentially expressed genes
The following tables show the top 10 differentially expressed genes, the pathways they are involved in,
and the top 10 reporter metabolites of each model that are also involved in these pathways.
Insulin resistant vs. insulin sensitive omental tissue
| Rank | EntrezID | RefSeqID | GeneName | P-value | Pathway | Adipocyte | EHMN | Recon1 |
|---|---|---|---|---|---|---|---|---|
| 1 | 54715 | NM_001142333, NM_001142334, NM_018723, NM_145891, NM_145892, NM_145893 | ataxin 2-binding protein 1 | 1.33239845066732e-05 | | | | |
| 2 | 64102 | NM_022144 | tenomodulin | 2.71410735293089e-05 | | | | |
| 3 | 220594 | NR_003554 | TL132 protein | 5.61926014343038e-05 | | | | |
| 4 | 89910 | NM_130466, NM_183415 | ubiquitin protein ligase E3B | 9.08837395856669e-05 | hsa04120: Ubiquitin mediated proteolysis | | | |
| 5 | 1844 | NM_004418 | dual specificity phosphatase 2 | 0.000165601301054344 | hsa04010: Mitogen-activated protein kinase (MAPK) signaling pathway | | | |
| 6 | 2354 | NM_001114171, NM_006732 | FBJ murine osteosarcoma viral oncogene homolog B | 0.000165704687771452 | hsa04380: Osteoclast differentiation; hsa05030: Cocaine addiction; hsa05031: Amphetamine addiction; hsa05034: Alcoholism | | | C00469 |
| 7 | 8418 | NR_002174 | cytidine monophosphate-N-acetylneuraminic acid hydroxylase (CMP-N-acetylneuraminate monooxygenase) pseudogene | 0.000169514884785619 | | | | |
| 8 | 2098 | NM_001984 | Full length insert cDNA clone YP41C11 | 0.000178493462148643 | | | | |
| 9 | 114836 | NM_052931 | SLAM family member 6 | 0.000204923151414557 | | | | |
| 10 | 26045 | NM_015564 | leucine rich repeat transmembrane neuronal 2 | 0.000220613989325125 | | | | |
Table A.22: The top 10 differentially expressed genes from the comparison ’insulin resistant vs. insulin
sensitive omental tissue’ with the corresponding pathways and reporter metabolites.
Insulin resistant vs. insulin sensitive subcutaneous tissue
| Rank | EntrezID | RefSeqID | GeneName | P-value | Pathway | Adipocyte | EHMN | Recon1 |
|---|---|---|---|---|---|---|---|---|
| 1 | 84281 | NM_001042519, NM_001042520, NM_001042521, NM_032321 | hypothetical protein MGC13057 | 4.42412124189397e-05 | | | | |
| 2 | 9096 | NM_001080508 | T-box 18 | 9.63669135993597e-05 | | | | |
| 3 | 80763 | NM_030572 | chromosome 12 open reading frame 39 | 0.000127217462105046 | | | | |
| 4 | 1948 | NM_004093 | ephrin-B2 | 0.000142267737510861 | hsa04360: Axon guidance | | | |
| 5 | 891 | NM_031966 | cyclin B1 | 0.000166836605789672 | hsa04110: Cell cycle; hsa04114: Oocyte meiosis; hsa04115: p53 signaling pathway; hsa04914: Progesterone-mediated oocyte maturation | | | |
| 6 | 84749 | NM_032663 | ubiquitin specific peptidase 30 | 0.000193508782961701 | | | | |
| 7 | 10335 | NM_001100163, NM_001100167, NM_130385, NM_001098579 | murine retrovirus integration site 1 homolog | 0.00026442219404752 | hsa04270: Vascular smooth muscle contraction | | | |
| 8 | 440279 | NM_001080534 | unc-13 homolog C (C. elegans) | 0.00030208007761224 | hsa04721: Synaptic vesicle cycle | | | |
| 9 | 389722 | XM_927067 | similar to cell recognition molecule CASPR3 | 0.000313536251566601 | | | | |
| 10 | 220594 | NR_003554 | TL132 protein | 0.000371949789481665 | | | | |
Table A.23: The top 10 differentially expressed genes from the comparison ’insulin resistant vs. insulin
sensitive subcutaneous tissue’ with the corresponding pathways and reporter metabolites.
A.2.2 Comparison between the models
The following tables show the top 10 reporter metabolites of one model in comparison to the rank of
these metabolites using the other two models and the same expression data.
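The construction of such a comparison table can be sketched as follows. This is a minimal illustration and not the implementation used in this thesis: the function names `dense_ranks` and `compare_top_n`, the input format (one dictionary of metabolite ID to P-value per model), and the toy values are assumptions made for the example. The sketch also assumes that metabolites with identical P-values share one rank, which is how ties appear in the tables, and that a metabolite missing from another model is reported as NA (here `None`).

```python
from typing import Dict, List, Optional, Tuple

# (rank, p-value) in another model, or None when the metabolite is absent ("NA")
Cell = Optional[Tuple[int, float]]


def dense_ranks(pvalues: Dict[str, float]) -> Dict[str, int]:
    """Rank metabolites by ascending p-value; equal p-values share one rank."""
    distinct = sorted(set(pvalues.values()))
    rank_of = {p: i + 1 for i, p in enumerate(distinct)}
    return {met: rank_of[p] for met, p in pvalues.items()}


def compare_top_n(
    primary: Dict[str, float],
    others: Dict[str, Dict[str, float]],
    n: int = 10,
) -> List[dict]:
    """List every metabolite with rank <= n in the primary result and
    look up its rank and p-value in each of the other results."""
    primary_ranks = dense_ranks(primary)
    other_ranks = {name: dense_ranks(pv) for name, pv in others.items()}
    rows = []
    # Sort by p-value, then by ID so that tied metabolites have a stable order.
    for met in sorted(primary, key=lambda m: (primary[m], m)):
        if primary_ranks[met] > n:
            break
        row = {"metabolite": met, "rank": primary_ranks[met], "p": primary[met]}
        for name, pv in others.items():
            cell: Cell = (other_ranks[name][met], pv[met]) if met in pv else None
            row[name] = cell
        rows.append(row)
    return rows


# Toy input with invented IDs and p-values (not taken from the result tables).
model_a = {"M1": 0.001, "M2": 0.002, "M3": 0.002, "M4": 0.5}
model_b = {"M1": 0.3, "M3": 0.01}
for row in compare_top_n(model_a, {"model_b": model_b}, n=2):
    print(row)
```

Because tied metabolites share a rank, a "top 10" table built this way can contain more than ten rows, as several of the tables in this appendix do.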
Insulin resistant vs. insulin sensitive omental tissue
| KEGG ID | Metabolite name | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| C00058 | Formate | 1 | 0.0000345065 | 777 | 0.343703 | 2 | 0.00379778 |
| C01031 | S-Formylglutathione | 2 | 0.00415728 | 775 | 0.343151 | 4 | 0.00432754 |
| C02934 | 3-Dehydrosphinganine | 3 | 0.014462 | 97 | 0.0530286 | 36 | 0.0335574 |
| C00083 | Malonyl-CoA | 4 | 0.0296929 | 146 | 0.0721345 | 67 | 0.0545762 |
| C00051 | Reduced glutathione | 5 | 0.0302962 | 1539 | 0.685389 | 156 | 0.117349 |
| C00064 | L-Glutamine | 6 | 0.0384255 | 1139 | 0.497377 | 129 | 0.0988624 |
| C00033 | Acetate | 7 | 0.0500906 | 970 | 0.427174 | 319 | 0.244038 |
| C00186 | L-Lactate | 7 | 0.0500906 | 123 | 0.0603206 | 1007 | 0.784773 |
| C00006 | Nicotinamide adenine dinucleotide phosphate | 8 | 0.050133 | 418 | 0.185274 | 46 | 0.0390321 |
| C00005 | Nicotinamide adenine dinucleotide phosphate - reduced | 8 | 0.050133 | 1103 | 0.482451 | 38 | 0.0342608 |
| C00013 | Diphosphate | 9 | 0.0589676 | 1534 | 0.683081 | 761 | 0.567991 |
| C00080 | H+ | 10 | 0.060263 | 284 | 0.130459 | 680 | 0.50305 |
Table A.24: The top 10 reporter metabolites of the comparison ’insulin resistant vs. insulin sensitive
omental tissue’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
| KEGG ID | Metabolite name | EHMN rank | EHMN P-value | Adipocyte rank | Adipocyte P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| C01416 | Cocaine | 1 | 0.000450126 | NA | NA | 808 | 0.606428 |
| C12448 | Ecgonine methyl ester | 1 | 0.000450126 | NA | NA | 801 | 0.598147 |
| C14818 | Fe2+ | 2 | 0.00172031 | NA | NA | NA | NA |
| C14819 | Fe3+ | 2 | 0.00172031 | NA | NA | 820 | 0.614619 |
| C00145 | Thiol | 3 | 0.00215296 | NA | NA | NA | NA |
| C00496 | Ubiquitin | 3 | 0.00215296 | NA | NA | NA | NA |
| C04090 | Ubiquitin C-terminal thiolester | 3 | 0.00215296 | NA | NA | NA | NA |
| C00346 | Ethanolamine phosphate | 4 | 0.00315727 | NA | NA | 201 | 0.155935 |
| C00029 | UDPglucose | 5 | 0.00355117 | 115 | 0.34338 | 445 | 0.343342 |
| C00097 | L-Cysteine | 6 | 0.00366629 | 146 | 0.432218 | 43 | 0.0382238 |
| C04419 | Carboxybiotin-carboxyl-carrier protein | 7 | 0.00455038 | NA | NA | NA | NA |
| C06250 | Holo-[carboxylase] | 7 | 0.00455038 | NA | NA | NA | NA |
| CE6241 | S-(9-deoxy-delta12-PGD2)-glutathione | 8 | 0.00551476 | NA | NA | NA | NA |
| CE6243 | S-(9-deoxy-delta9,12-PGD2)-glutathione | 8 | 0.00551476 | NA | NA | NA | NA |
| C04549 | 1-Phosphatidyl-1D-myo-inositol 3-phosphate | 9 | 0.00651937 | NA | NA | 731 | 0.546049 |
| CE5132 | 1-phosphatidyl-myo-inositol 3,5-bisphosphate | 9 | 0.00651937 | NA | NA | NA | NA |
| C05959 | 11-epi-Prostaglandin F2alpha | 10 | 0.00655607 | NA | NA | NA | NA |
| C00639 | Prostaglandin F2alpha | 10 | 0.00655607 | NA | NA | 1196 | 0.963089 |
| CE6244 | S-(11-hydroxy-9-deoxy-delta12-PGD2)-glutathione | 10 | 0.00655607 | NA | NA | NA | NA |
| CE6245 | S-(11-OH-9-deoxy-delta9,12-PGD2)-glutathione | 10 | 0.00655607 | NA | NA | NA | NA |
Table A.25: The top 10 reporter metabolites of the comparison ’insulin resistant vs. insulin sensitive
omental tissue’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
| KEGG ID | Metabolite name | Recon 1 rank | Recon 1 P-value | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value |
|---|---|---|---|---|---|---|---|
| C15519 | 25-Hydroxycholesterol | 1 | 0.00128471 | NA | NA | NA | NA |
| C00058 | Formate | 2 | 0.00379778 | 237 | 0.70459 | 777 | 0.343703 |
| C19586 | 8,9 epxoy aflatoxin B1 | 3 | 0.00390178 | NA | NA | NA | NA |
| C06800 | aflatoxin B1 | 3 | 0.00390178 | NA | NA | NA | NA |
| C01031 | S-Formylglutathione | 4 | 0.00432754 | 2 | 0.00415728 | 775 | 0.343151 |
| C06423 | octanoate (n-C8:0) | 5 | 0.0047828 | 73 | 0.258549 | NA | NA |
| C05100 | 3-Ureidoisobutyrate | 6 | 0.00681573 | NA | NA | 255 | 0.121417 |
| C01205 | D-3-Amino-isobutanoate | 6 | 0.00681573 | NA | NA | 1377 | 0.601485 |
| C00001 | H2O | 7 | 0.0094336 | 63 | 0.22086 | 1150 | 0.504173 |
| C00114 | Choline | 8 | 0.00972819 | 71 | 0.257347 | 639 | 0.282791 |
| C00469 | Ethanol | 9 | 0.00981991 | 14 | 0.0760583 | 52 | 0.0307616 |
| C00010 | Coenzyme A | 10 | 0.011339 | 215 | 0.662553 | 352 | 0.160896 |
Table A.26: The top 10 reporter metabolites of the comparison ’insulin resistant vs. insulin sensitive
omental tissue’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
Insulin resistant vs. insulin sensitive subcutaneous tissue
| KEGG ID | Metabolite name | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| C04640 | 2-(Formamido)-N1-(5-phospho-D-ribosyl)acetamidine | 1 | 0.0051115 | NA | NA | 5 | 0.00677677 |
| C04376 | N2-Formyl-N1-(5-phospho-D-ribosyl)glycinamide | 1 | 0.0051115 | NA | NA | 5 | 0.00677677 |
| C00311 | Isocitrate | 2 | 0.0139724 | 26 | 0.0163868 | 463 | 0.347262 |
| C03090 | 5-Phospho-beta-D-ribosylamine | 3 | 0.0194708 | 36 | 0.021962 | 25 | 0.021535 |
| C00158 | Citrate | 4 | 0.0198505 | 1507 | 0.729585 | 1212 | 0.970628 |
| C00013 | Diphosphate | 5 | 0.0217586 | 783 | 0.378562 | 997 | 0.8083 |
| C00279 | D-Erythrose 4-phosphate | 6 | 0.0266967 | 382 | 0.179697 | 32 | 0.0283795 |
| C05382 | Sedoheptulose 7-phosphate | 6 | 0.0266967 | 797 | 0.386939 | 32 | 0.0283795 |
| C02679 | dodecanoate (C12:0) | 7 | 0.0301599 | 152 | 0.0813766 | 701 | 0.544609 |
| | Eicosanoate (n-C20:0) | 7 | 0.0301599 | NA | NA | NA | NA |
| C00249 | hexadecanoate (n-C16:0) | 7 | 0.0301599 | 1516 | 0.733581 | 401 | 0.300038 |
| C01530 | octadecanoate (n-C18:0) | 7 | 0.0301599 | 1038 | 0.4949 | 622 | 0.482883 |
| | pentadecanoate (C15:0) | 7 | 0.0301599 | NA | NA | NA | NA |
| C06424 | tetradecanoate (C14:0) | 7 | 0.0301599 | 152 | 0.0813766 | 252 | 0.198748 |
| C00143 | 5,10-Methylenetetrahydrofolate | 8 | 0.0307075 | 1841 | 0.885839 | 57 | 0.0452544 |
| C03838 | N1-(5-Phospho-D-ribosyl)glycinamide | 9 | 0.0378884 | NA | NA | 55 | 0.0435885 |
| C03232 | 3-Phosphohydroxypyruvate | 10 | 0.0411572 | 500 | 0.222261 | 51 | 0.0415425 |
Table A.27: The top 10 reporter metabolites of the comparison ’insulin resistant vs. insulin sensitive
subcutaneous tissue’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
| KEGG ID | Metabolite name | EHMN rank | EHMN P-value | Adipocyte rank | Adipocyte P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| C01190 | Glucosylceramide | 1 | 0.000634892 | NA | NA | 1 | 0.000661951 |
| CE2435 | trans,cis-deca-2,4-dienoyl-CoA | 2 | 0.00345956 | NA | NA | NA | NA |
| C00018 | Pyridoxal phosphate | 3 | 0.00392722 | NA | NA | NA | NA |
| C00534 | Pyridoxamine | 3 | 0.00392722 | NA | NA | 558 | 0.421529 |
| C00647 | Pyridoxamine phosphate | 3 | 0.00392722 | NA | NA | NA | NA |
| C00314 | Pyridoxine | 3 | 0.00392722 | NA | NA | 558 | 0.421529 |
| C00627 | Pyridoxine phosphate | 3 | 0.00392722 | NA | NA | NA | NA |
| CE5787 | kinetensin 1-3 | 4 | 0.00443797 | NA | NA | NA | NA |
| C00105 | UMP | 5 | 0.00502282 | 242 | 0.733361 | 251 | 0.198241 |
| C00145 | Thiol | 6 | 0.0052974 | NA | NA | NA | NA |
| C00496 | Ubiquitin | 6 | 0.0052974 | NA | NA | NA | NA |
| C04090 | Ubiquitin C-terminal thiolester | 6 | 0.0052974 | NA | NA | NA | NA |
| C00439 | N-Formimino-L-glutamate | 7 | 0.0061186 | NA | NA | 287 | 0.222142 |
| C00026 | 2-Oxoglutarate | 8 | 0.00751671 | 196 | 0.584116 | 331 | 0.253392 |
| C00100 | Propanoyl-CoA | 9 | 0.00752986 | 44 | 0.107457 | 466 | 0.351782 |
| G00088 | (Gal)3 (Glc)1 (GlcNAc)2 (Neu5Ac)1 (Cer)1 | 10 | 0.00866419 | NA | NA | 6 | 0.00888409 |
Table A.28: The top 10 reporter metabolites of the comparison ’insulin resistant vs. insulin sensitive
subcutaneous tissue’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
| KEGG ID | Metabolite name | Recon 1 rank | Recon 1 P-value | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value |
|---|---|---|---|---|---|---|---|
| C00195 | ceramide (homo sapiens) | 1 | 0.000661951 | NA | NA | 684 | 0.32046 |
| C01190 | glucocerebroside (homo sapiens) | 1 | 0.000661951 | NA | NA | 1 | 0.000634892 |
| C00083 | Malonyl-CoA | 2 | 0.000721689 | 57 | 0.129054 | 167 | 0.0879732 |
| C00311 | Isocitrate | 3 | 0.00444189 | 230 | 0.676033 | 26 | 0.0163868 |
| | chondroitin sulfate B / dermatan sulfate (IdoA2S-GalNAc4S), degradation product 4 | 4 | 0.00524946 | NA | NA | NA | NA |
| | heparan sulfate, degradation product 16 | 4 | 0.00524946 | NA | NA | NA | NA |
| | heparan sulfate, degradation product 22 | 4 | 0.00524946 | NA | NA | NA | NA |
| C04640 | 2-(Formamido)-N1-(5-phospho-D-ribosyl)acetamidine | 5 | 0.00677677 | 1 | 0.0051115 | NA | NA |
| C04376 | N2-Formyl-N1-(5-phospho-D-ribosyl)glycinamide | 5 | 0.00677677 | 1 | 0.0051115 | NA | NA |
| G00088 | VI3NeuAc-nLc6Cer | 6 | 0.00888409 | NA | NA | 10 | 0.00866419 |
| C00013 | Diphosphate | 7 | 0.00986705 | 280 | 0.866302 | 783 | 0.378562 |
| C00318 | L-Carnitine | 8 | 0.0102979 | 55 | 0.122451 | 320 | 0.153054 |
| | 2-Decaprenyl-3-methyl-5-hydroxy-6-methoxy-1,4-benzoquinone | 9 | 0.010683 | NA | NA | NA | NA |
| C00105 | UMP | 10 | 0.0107119 | 242 | 0.733361 | 5 | 0.00502282 |
Table A.29: The top 10 reporter metabolites of the comparison ’insulin resistant vs. insulin sensitive
subcutaneous tissue’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
A.2.3 Comparison of expression data
The following tables show the comparison of the top 10 reporter metabolites using the different expression data of this dataset and the same model.
Adipocyte
| KEGG ID | Metabolite name | Omental rank | Omental P-value | Subcutaneous rank | Subcutaneous P-value |
|---|---|---|---|---|---|
| C00058 | Formate | 1 | 0.0000345065 | 249 | 0.762469 |
| C01031 | S-Formylglutathione | 2 | 0.00415728 | 108 | 0.286822 |
| C02934 | 3-Dehydrosphinganine | 3 | 0.014462 | 28 | 0.0758283 |
| C00083 | Malonyl-CoA | 4 | 0.0296929 | 57 | 0.129054 |
| C00051 | Reduced glutathione | 5 | 0.0302962 | 132 | 0.343607 |
| C00064 | L-Glutamine | 6 | 0.0384255 | 247 | 0.750823 |
| C00033 | Acetate | 7 | 0.0500906 | 262 | 0.800853 |
| C00186 | L-Lactate | 7 | 0.0500906 | 17 | 0.0548723 |
| C00006 | Nicotinamide adenine dinucleotide phosphate | 8 | 0.050133 | 16 | 0.0539618 |
| C00005 | Nicotinamide adenine dinucleotide phosphate - reduced | 8 | 0.050133 | 16 | 0.0539618 |
| C00013 | Diphosphate | 9 | 0.0589676 | 280 | 0.866302 |
| C00080 | H+ | 10 | 0.060263 | 205 | 0.59862 |
Table A.30: The comparison of the top 10 reporter metabolites between ’insulin resistant vs. insulin
sensitive omental tissue’ and ’insulin resistant vs. insulin sensitive subcutaneous tissue’ based on the
adipocyte model.
| KEGG ID | Metabolite name | Subcutaneous rank | Subcutaneous P-value | Omental rank | Omental P-value |
|---|---|---|---|---|---|
| C04640 | 2-(Formamido)-N1-(5-phospho-D-ribosyl)acetamidine | 1 | 0.0051115 | 149 | 0.43856 |
| C04376 | N2-Formyl-N1-(5-phospho-D-ribosyl)glycinamide | 1 | 0.0051115 | 149 | 0.43856 |
| C00311 | Isocitrate | 2 | 0.0139724 | 29 | 0.121157 |
| C03090 | 5-Phospho-beta-D-ribosylamine | 3 | 0.0194708 | 30 | 0.121373 |
| C00158 | Citrate | 4 | 0.0198505 | 299 | 0.954296 |
| C00013 | Diphosphate | 5 | 0.0217586 | 261 | 0.798916 |
| C00279 | D-Erythrose 4-phosphate | 6 | 0.0266967 | 24 | 0.109783 |
| C05382 | Sedoheptulose 7-phosphate | 6 | 0.0266967 | 24 | 0.109783 |
| C02679 | dodecanoate (C12:0) | 7 | 0.0301599 | 56 | 0.206137 |
| | Eicosanoate (n-C20:0) | 7 | 0.0301599 | 56 | 0.206137 |
| C00249 | hexadecanoate (n-C16:0) | 7 | 0.0301599 | 56 | 0.206137 |
| C01530 | octadecanoate (n-C18:0) | 7 | 0.0301599 | 91 | 0.292422 |
| | pentadecanoate (C15:0) | 7 | 0.0301599 | 56 | 0.206137 |
| C06424 | tetradecanoate (C14:0) | 7 | 0.0301599 | 56 | 0.206137 |
| C00143 | 5,10-Methylenetetrahydrofolate | 8 | 0.0307075 | 168 | 0.485562 |
| C03838 | N1-(5-Phospho-D-ribosyl)glycinamide | 9 | 0.0378884 | 97 | 0.30705 |
| C03232 | 3-Phosphohydroxypyruvate | 10 | 0.0411572 | 127 | 0.376986 |
Table A.31: The comparison of the top 10 reporter metabolites between ’insulin resistant vs. insulin
sensitive subcutaneous tissue’ and ’insulin resistant vs. insulin sensitive omental tissue’ based on the
adipocyte model.
EHMN
| KEGG ID | Metabolite name | Omental rank | Omental P-value | Subcutaneous rank | Subcutaneous P-value |
|---|---|---|---|---|---|
| C01416 | Cocaine | 1 | 0.000450126 | 1657 | 0.804592 |
| C12448 | Ecgonine methyl ester | 1 | 0.000450126 | 1657 | 0.804592 |
| C14818 | Fe2+ | 2 | 0.00172031 | 276 | 0.131285 |
| C14819 | Fe3+ | 2 | 0.00172031 | 714 | 0.336101 |
| C00145 | Thiol | 3 | 0.00215296 | 602 | 0.276338 |
| C00496 | Ubiquitin | 3 | 0.00215296 | 1368 | 0.655487 |
| C04090 | Ubiquitin C-terminal thiolester | 3 | 0.00215296 | 602 | 0.276338 |
| C00346 | Ethanolamine phosphate | 4 | 0.00315727 | 116 | 0.0621875 |
| C00029 | UDPglucose | 5 | 0.00355117 | 100 | 0.0517659 |
| C00097 | L-Cysteine | 6 | 0.00366629 | 934 | 0.447135 |
| C04419 | Carboxybiotin-carboxyl-carrier protein | 7 | 0.00455038 | 701 | 0.331411 |
| C06250 | Holo-[carboxylase] | 7 | 0.00455038 | 76 | 0.0410101 |
| CE6241 | S-(9-deoxy-delta12-PGD2)-glutathione | 8 | 0.00551476 | 1219 | 0.577494 |
| CE6243 | S-(9-deoxy-delta9,12-PGD2)-glutathione | 8 | 0.00551476 | 1219 | 0.577494 |
| C04549 | 1-Phosphatidyl-1D-myo-inositol 3-phosphate | 9 | 0.00651937 | 1235 | 0.584706 |
| CE5132 | 1-phosphatidyl-myo-inositol 3,5-bisphosphate | 9 | 0.00651937 | 1846 | 0.889204 |
| C05959 | 11-epi-Prostaglandin F2alpha | 10 | 0.00655607 | 881 | 0.42114 |
| C00639 | Prostaglandin F2alpha | 10 | 0.00655607 | 1237 | 0.585272 |
| CE6244 | S-(11-hydroxy-9-deoxy-delta12-PGD2)-glutathione | 10 | 0.00655607 | 881 | 0.42114 |
| CE6245 | S-(11-OH-9-deoxy-delta9,12-PGD2)-glutathione | 10 | 0.00655607 | 881 | 0.42114 |
Table A.32: The comparison of the top 10 reporter metabolites between ’insulin resistant vs. insulin
sensitive omental tissue’ and ’insulin resistant vs. insulin sensitive subcutaneous tissue’ based on the
EHMN model.
KEGG ID | Metabolite name | Rank (subcutaneous) | P-value (subcutaneous) | Rank (omental) | P-value (omental)
C01190 | Glucosylceramide | 1 | 0.000634892 | 37 | 0.0207825
CE2435 | trans,cis-deca-2,4-dienoyl-CoA | 2 | 0.00345956 | 68 | 0.0373244
C00018 | Pyridoxal phosphate | 3 | 0.00392722 | 1064 | 0.466204
C00534 | Pyridoxamine | 3 | 0.00392722 | 1064 | 0.466204
C00647 | Pyridoxamine phosphate | 3 | 0.00392722 | 1064 | 0.466204
C00314 | Pyridoxine | 3 | 0.00392722 | 1064 | 0.466204
C00627 | Pyridoxine phosphate | 3 | 0.00392722 | 1064 | 0.466204
CE5787 | kinetensin 1-3 | 4 | 0.00443797 | 154 | 0.0758005
C00105 | UMP | 5 | 0.00502282 | 959 | 0.422072
C00145 | Thiol | 6 | 0.0052974 | 1128 | 0.491042
C00496 | Ubiquitin | 6 | 0.0052974 | 640 | 0.283103
C04090 | Ubiquitin C-terminal thiolester | 6 | 0.0052974 | 1128 | 0.491042
C00439 | N-Formimino-L-glutamate | 7 | 0.0061186 | 200 | 0.0966571
C00026 | 2-Oxoglutarate | 8 | 0.00751671 | 574 | 0.254968
C00100 | Propanoyl-CoA | 9 | 0.00752986 | 1224 | 0.533772
G00088 | (Gal)3 (Glc)1 (GlcNAc)2 (Neu5Ac)1 (Cer)1 | 10 | 0.00866419 | 726 | 0.323394
Table A.33: The comparison of the top 10 reporter metabolites between 'insulin resistant vs. insulin sensitive subcutaneous tissue' and 'insulin resistant vs. insulin sensitive omental tissue' based on the EHMN model.
Recon 1
KEGG ID | Metabolite name | Rank (omental) | P-value (omental) | Rank (subcutaneous) | P-value (subcutaneous)
C15519 | 25-Hydroxycholesterol | 1 | 0.00128471 | 233 | 0.186681
C00058 | Formate | 2 | 0.00379778 | 239 | 0.191749
C19586 | 8,9 epxoy aflatoxin B1 | 3 | 0.00390178 | 763 | 0.604782
C06800 | aflatoxin B1 | 3 | 0.00390178 | 763 | 0.604782
C01031 | S-Formylglutathione | 4 | 0.00432754 | 325 | 0.244277
C06423 | octanoate (n-C8:0) | 5 | 0.0047828 | 599 | 0.46389
C05100 | 3-Ureidoisobutyrate | 6 | 0.00681573 | 235 | 0.188754
C01205 | D-3-Amino-isobutanoate | 6 | 0.00681573 | 235 | 0.188754
C00001 | H2O | 7 | 0.0094336 | 326 | 0.24433
C00114 | Choline | 8 | 0.00972819 | 859 | 0.695636
C00469 | Ethanol | 9 | 0.00981991 | 1188 | 0.95315
C00010 | Coenzyme A | 10 | 0.011339 | 66 | 0.0521972
Table A.34: The comparison of the top 10 reporter metabolites between 'insulin resistant vs. insulin sensitive omental tissue' and 'insulin resistant vs. insulin sensitive subcutaneous tissue' based on the Recon 1 model.
KEGG ID | Metabolite name | Rank (subcutaneous) | P-value (subcutaneous) | Rank (omental) | P-value (omental)
C00195 | ceramide (homo sapiens) | 1 | 0.000661951 | 22 | 0.0215455
C01190 | glucocerebroside (homo sapiens) | 1 | 0.000661951 | 22 | 0.0215455
C00083 | Malonyl-CoA | 2 | 0.000721689 | 67 | 0.0545762
C00311 | Isocitrate | 3 | 0.00444189 | 93 | 0.0726228
- | chondroitin sulfate B / dermatan sulfate (IdoA2S-GalNAc4S), degradation product 3 | 4 | 0.00524946 | 293 | 0.21864
- | heparan sulfate, degradation product 16 | 4 | 0.00524946 | 293 | 0.21864
- | heparan sulfate, degradation product 22 | 4 | 0.00524946 | 293 | 0.21864
C04640 | 2-(Formamido)-N1-(5-phospho-D-ribosyl)acetamidine | 5 | 0.00677677 | 625 | 0.466608
C04376 | N2-Formyl-N1-(5-phospho-D-ribosyl)glycinamide | 5 | 0.00677677 | 625 | 0.466608
G00088 | VI3NeuAc-nLc6Cer | 6 | 0.00888409 | 431 | 0.332104
C00013 | Diphosphate | 7 | 0.00986705 | 761 | 0.567991
C00318 | L-Carnitine | 8 | 0.0102979 | 515 | 0.396884
- | 2-Decaprenyl-3-methyl-5-hydroxy-6-methoxy-1,4-benzoquinone | 9 | 0.010683 | 918 | 0.703013
C00105 | UMP | 10 | 0.0107119 | 395 | 0.305826
Table A.35: The comparison of the top 10 reporter metabolites between 'insulin resistant vs. insulin sensitive subcutaneous tissue' and 'insulin resistant vs. insulin sensitive omental tissue' based on the Recon 1 model.
A.3 Genome-wide analysis of adipose tissue gene expression in twin-pairs discordant for physical activity for over 30 years (GSE20536)
Biopsy samples of adipose tissue were taken from twin pairs that had been followed for their discordance in physical activity for 32 years [91]:
- Two monozygotic and four dizygotic twin pairs
- One paired sample per twin pair
Differentially expressed genes are calculated from the comparison of the active against the non-active twins.
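Because each twin pair contributes one active and one non-active sample, the comparison is naturally a paired one. The following is a minimal sketch of a per-gene paired t statistic, assuming log-scale expression values; this appendix does not state which test was actually used, so both the choice of test and the example values are assumptions made for illustration only.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(active, inactive):
    """Paired t statistic for one gene; position i in both lists is the same twin pair."""
    if len(active) != len(inactive) or len(active) < 2:
        raise ValueError("need at least two complete pairs")
    # Work on within-pair differences so that between-pair variation cancels out.
    diffs = [a - b for a, b in zip(active, inactive)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all pair differences are identical")
    return mean(diffs) / (sd / sqrt(len(diffs)))


# Hypothetical log2 expression values of one gene in six twin pairs.
active = [8.1, 7.9, 8.4, 8.0, 8.3, 8.2]
inactive = [7.6, 7.7, 7.9, 7.8, 7.7, 7.9]
t_value = paired_t_statistic(active, inactive)
```

A positive statistic indicates higher expression in the active twins; a p-value would follow from a t distribution with n - 1 degrees of freedom, where n is the number of pairs.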
A.3.1 Differentially expressed genes
The following table shows the top 10 differentially expressed genes, the pathways they are involved in, and the top 10 reporter metabolites that are also involved in these pathways.
Active vs. non-active
Rank | EntrezID | RefSeqID | GeneName | P-value | Pathway | Adipocyte | EHMN | Recon1
1 | 2162 | NM_000129.2 | Homo sapiens coagulation factor XIII, A1 polypeptide (F13A1), mRNA. | 2.50889117574308e-05 | hsa04610: Complement and coagulation cascades | - | - | -
2 | 55154 | NM_018116.2 | Homo sapiens misato homolog 1 (Drosophila) (MSTO1), mRNA. | 6.80124299891444e-05 | - | - | - | -
3 | 6584 | NM_003060.2 | Homo sapiens solute carrier family 22 (organic cation transporter), member 5 (SLC22A5), mRNA. | 0.000207549280179673 | - | - | - | -
4 | 440349 | XM_496129.2 | PREDICTED: Homo sapiens similar to nuclear pore complex interacting protein, transcript variant 1 (LOC440349), mRNA. | 0.00021302455892714 | - | - | - | -
5 | 9270 | NM_004763.2 | Homo sapiens integrin beta 1 binding protein 1 (ITGB1BP1), transcript variant 1, mRNA. | 0.000320744391026067 | - | - | - | -
6 | 81493 | NM_030786.1 | Homo sapiens syncoilin, intermediate filament 1 (SYNC1), mRNA. | 0.000422488677668484 | - | - | - | -
7 | 2745 | NM_002064.1 | Homo sapiens glutaredoxin (thioltransferase) (GLRX), mRNA. | 0.000488692163262734 | - | - | - | -
8 | 57129 | NM_020409.2 | Homo sapiens mitochondrial ribosomal protein L47 (MRPL47), nuclear gene encoding mitochondrial protein, transcript variant 1, mRNA. | 0.000533011668188304 | - | - | - | -
9 | 6892 | NM_172209.1 | Homo sapiens TAP binding protein (tapasin) (TAPBP), transcript variant 3, mRNA. | 0.000575535751947747 | hsa04612: Antigen processing and presentation | - | - | -
10 | 10163 | NM_006990.2 | Homo sapiens WAS protein family, member 2 (WASF2), mRNA. | 0.000642954970236917 | hsa04520: Adherens junction | - | - | -
 |  |  |  |  | hsa04666: Fc gamma R-mediated phagocytosis | - | - | -
 |  |  |  |  | hsa04810: Regulation of actin cytoskeleton | - | - | -
 |  |  |  |  | hsa05100: Bacterial invasion of epithelial cells | - | - | -
 |  |  |  |  | hsa05131: Shigellosis | - | - | -
 |  |  |  |  | hsa05132: Salmonella infection | - | - | -
Table A.36: The top 10 differentially expressed genes from the comparison 'active vs. non-active' with the corresponding pathways and reporter metabolites.
A.3.2 Comparison between the models
The following tables show the top 10 reporter metabolites of one model alongside the ranks these metabolites receive in the other two models for the same expression data.
Active vs. non-active
Adipocyte model
KEGG ID | Metabolite name | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value | Recon 1 rank | Recon 1 P-value
C00008 | ADP | 1 | 0.00167407 | 523 | 0.228823 | 324 | 0.276925
C00016 | FAD | 2 | 0.0146813 | 597 | 0.248455 | 117 | 0.111056
C01352 | FADH2 | 2 | 0.0146813 | 808 | 0.336281 | 117 | 0.111056
C04895 | 2-Amino-4-hydroxy-6-(erythro-1,2,3-trihydroxypropyl)dihydropteridine triphosphate | 3 | 0.0164251 | 274 | 0.121031 | 24 | 0.0261545
C00002 | ATP | 4 | 0.0197332 | 136 | 0.0651738 | 105 | 0.100173
C00091 | Succinyl-CoA | 5 | 0.0277063 | 1637 | 0.741003 | 95 | 0.092701
C00440 | 5-Methyltetrahydrofolate | 6 | 0.0285175 | 1945 | 0.902636 | 694 | 0.581199
C00042 | Succinate | 7 | 0.0313222 | 780 | 0.327249 | 644 | 0.538272
C01236 | 6-phospho-D-glucono-1,5-lactone | 8 | 0.0423082 | 111 | 0.0577854 | 205 | 0.182101
C00164 | Acetoacetate | 9 | 0.048282 | 236 | 0.107325 | 81 | 0.0838792
C03912 | 1-Pyrroline-5-carboxylate | 10 | 0.0487278 | 225 | 0.103803 | 123 | 0.11462
C00148 | L-Proline | 10 | 0.0487278 | 2048 | 0.95538 | 661 | 0.550038
Table A.37: The top 10 reporter metabolites of the comparison 'active vs. non-active' using the adipocyte model in comparison to the EHMN and Recon 1 model.
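The layout of these comparison tables can be reproduced with a small lookup. The sketch below assumes each model yields a mapping from metabolite to reporter P-value; metabolites with identical P-values share a rank, which would explain why a "top 10" table can hold more than ten rows, and metabolites absent from another model are reported as NA. The function names and the example values are hypothetical and are not taken from the thesis implementation.

```python
def dense_ranks(pvalues):
    """Map metabolite -> rank (1 = smallest P-value); equal P-values share a rank."""
    ordered = sorted(set(pvalues.values()))
    rank_of = {p: i + 1 for i, p in enumerate(ordered)}
    return {met: rank_of[p] for met, p in pvalues.items()}


def compare_models(reference, others, top=10):
    """List the top-ranked metabolites of `reference` with their rank and P-value in `others`."""
    ref_ranks = dense_ranks(reference)
    other_ranks = {name: dense_ranks(pv) for name, pv in others.items()}
    rows = []
    for met in sorted(reference, key=lambda m: (ref_ranks[m], m)):
        if ref_ranks[met] > top:
            break
        row = {"metabolite": met, "rank": ref_ranks[met], "p": reference[met]}
        for name, pv in others.items():
            # "NA" marks a metabolite that the other model does not contain.
            row[name] = (other_ranks[name].get(met, "NA"), pv.get(met, "NA"))
        rows.append(row)
    return rows


# Hypothetical P-values for two toy models.
model_a = {"ADP": 0.001, "FAD": 0.014, "FADH2": 0.014, "ATP": 0.019}
model_b = {"ADP": 0.22, "FAD": 0.05}
table = compare_models(model_a, {"model_b": model_b}, top=2)
```

With these toy values, `table` holds ADP at rank 1 and both FAD and FADH2 at rank 2; FADH2 carries `("NA", "NA")` for `model_b` because that model lacks it.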
EHMN model
KEGG ID | Metabolite name | EHMN rank | EHMN P-value | Adipocyte rank | Adipocyte P-value | Recon 1 rank | Recon 1 P-value
C00047 | L-Lysine | 1 | 0.00196101 | NA | NA | 1237 | 0.992782
C01094 | D-Fructose 1-phosphate | 2 | 0.00262318 | NA | NA | 5 | 0.00302379
C00518 | Hyaluronate | 3 | 0.00534445 | NA | NA | 916 | 0.759486
C00167 | UDPglucuronate | 3 | 0.00534445 | NA | NA | 912 | 0.755742
C03391 | DNA 6-methylaminopurine | 4 | 0.00566266 | NA | NA | NA | NA
C00821 | DNA adenine | 4 | 0.00566266 | NA | NA | NA | NA
C06893 | 2-Deoxy-5-keto-D-gluconic acid 6-phosphate | 5 | 0.00709321 | NA | NA | NA | NA
C00222 | 3-Oxopropanoate | 5 | 0.00709321 | NA | NA | 337 | 0.286842
CE1186 | D-xylulose-1-phosphate | 5 | 0.00709321 | NA | NA | NA | NA
C00111 | Glycerone phosphate | 5 | 0.00709321 | 72 | 0.266625 | 67 | 0.0720443
C00266 | Glycolaldehyde | 5 | 0.00709321 | NA | NA | 287 | 0.249745
CE2054 | 20-carboxy-leukotriene-B4 | 6 | 0.00754237 | NA | NA | NA | NA
C00301 | ADPribose | 7 | 0.00838736 | NA | NA | 57 | 0.0608186
C00310 | D-Xylulose | 8 | 0.00843894 | NA | NA | 143 | 0.126706
C00318 | L-Carnitine | 9 | 0.00858839 | 288 | 0.970025 | 614 | 0.51831
C02839 | L-Tyrosyl-tRNA(Tyr) | 10 | 0.0106306 | NA | NA | NA | NA
C00787 | tRNA(Tyr) | 10 | 0.0106306 | NA | NA | NA | NA
Table A.38: The top 10 reporter metabolites of the comparison 'active vs. non-active' using the EHMN model in comparison to the adipocyte and Recon 1 model.
Recon 1 model
KEGG ID | Metabolite name | Recon 1 rank | Recon 1 P-value | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value
- | D-Tagatose 1-phosphate | 1 | 0.000367843 | NA | NA | NA | NA
C00072 | L-Ascorbate | 2 | 0.00095762 | NA | NA | 855 | 0.363183
C00584 | Prostaglandin E2 | 3 | 0.00236753 | NA | NA | 207 | 0.0934877
C00577 | D-Glyceraldehyde | 4 | 0.00263828 | NA | NA | 96 | 0.0506967
C01094 | D-Fructose 1-phosphate | 5 | 0.00302379 | NA | NA | 2 | 0.00262318
- | D-Xylulose 1-phosphate | 5 | 0.00302379 | NA | NA | NA | NA
C00518 | hyaluronan | 6 | 0.00591994 | NA | NA | 3 | 0.00534445
- | R total 2 coenzyme A | 7 | 0.00671575 | NA | NA | NA | NA
- | pristanic acid | 8 | 0.00852173 | NA | NA | NA | NA
C00795 | D-Tagatose | 9 | 0.0092775 | NA | NA | NA | NA
C00301 | ADPribose | 10 | 0.00945689 | NA | NA | 109 | 0.0571841
Table A.39: The top 10 reporter metabolites of the comparison 'active vs. non-active' using the Recon 1 model in comparison to the adipocyte and EHMN model.
A.4 Differences in subcutaneous adipose tissue gene expression between obese African Americans and Hispanic Youths (GSE23506)
A cross-sectional study design was used to compare subcutaneous adipose tissue gene expression profiles [92]:
- 17 African Americans
- 19 Hispanics
Differentially expressed genes are calculated from the comparison of African Americans against Hispanics.
A.4.1 Differentially expressed genes
The following table shows the top 10 differentially expressed genes, the pathways they are involved in, and the top 10 reporter metabolites that are also involved in these pathways.
African Americans vs. Hispanics
Rank | EntrezID | RefSeqID | GeneName | P-value | Pathway | Adipocyte | EHMN | Recon1
1 | 6228 | NM_001025.4 | Homo sapiens ribosomal protein S23 (RPS23), mRNA. | 4.66501824511915e-07 | hsa03010: Ribosome | - | - | -
2 | 642934 | XM_942991.2 | PREDICTED: Homo sapiens hypothetical LOC642934 (LOC642934), mRNA. | 5.37782222890024e-07 | - | - | - | -
3 | 6428 | NM_003017.3 | Homo sapiens splicing factor, arginine/serine-rich 3 (SFRS3), mRNA. | 4.1302884682886e-05 | hsa03040: Spliceosome | - | - | -
 |  |  |  |  | hsa05168: Herpes simplex infection | - | - | -
4 | 10901 | NM_021004.2 | Homo sapiens dehydrogenase/reductase (SDR family) member 4 (DHRS4), mRNA. | 4.68304117637516e-05 | hsa00830: Retinol metabolism | - | - | -
 |  |  |  |  | hsa01100: Metabolic pathways | C00011, C00091, C00214, C00337, C00363, C00364, C00864, C01346 | C00191, C00584, C00639, C00641, C02165, C05455 | C00025, C00052, C00058, C00209, C00427, C00581, C01194, C05635, C05951
 |  |  |  |  | hsa04146: Peroxisome | - | - | -
5 | 6231 | NM_001029.3 | Homo sapiens ribosomal protein S26 (RPS26), mRNA. | 5.1153372883841e-05 | hsa03010: Ribosome | - | - | -
6 | 222967 | NM_173565.1 | Homo sapiens hypothetical protein LOC222967 (LOC222967), mRNA. | 5.82154912734042e-05 | - | - | - | -
7 | 26254 | NM_014359.3 | Homo sapiens opticin (OPTC), mRNA. | 6.85759996704967e-05 | - | - | - | -
8 | 650646 | XM_942527.2 | PREDICTED: Homo sapiens similar to 40S ribosomal protein S26 (LOC650646), mRNA. | 9.7940554821229e-05 | - | - | - | -
9 | 84632 | NM_032550.2 | Homo sapiens actin filament associated protein 1-like 2 (AFAP1L2), transcript variant 2, mRNA. | 0.000104322064758533 | - | - | - | -
10 | 6710 | NM_001024858.1 | Homo sapiens spectrin, beta, erythrocytic (includes spherocytosis, clinical type I) (SPTB), transcript variant 1, mRNA. | 0.000106741579871336 | - | - | - | -
Table A.40: The top 10 differentially expressed genes from the comparison 'African Americans vs. Hispanics' with the corresponding pathways and reporter metabolites.
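The reporter-metabolite columns of such a gene table amount to an intersection: for every pathway of a gene, the compounds of that pathway are matched against each model's top 10 reporter metabolites. A minimal sketch follows, assuming a pathway-to-compound mapping is available (for example from KEGG); the function name and the tiny compound sets are illustrative placeholders and do not reproduce real pathway content.

```python
def reporters_in_pathways(gene_pathways, pathway_compounds, top_reporters):
    """For each pathway of a gene, return the top reporter metabolites per model that lie in it."""
    result = {}
    for pathway in gene_pathways:
        members = pathway_compounds.get(pathway, set())
        result[pathway] = {
            model: sorted(members & set(ids)) for model, ids in top_reporters.items()
        }
    return result


# Illustrative inputs: pathway IDs of one gene and a deliberately tiny compound mapping.
gene_pathways = ["hsa00830", "hsa01100", "hsa04146"]
pathway_compounds = {"hsa01100": {"C00011", "C00091", "C00584", "C00025"}}
top_reporters = {
    "Adipocyte": ["C00011", "C00288", "C00091"],
    "EHMN": ["C00584", "C02165"],
    "Recon1": ["C00025", "C00209"],
}
hits = reporters_in_pathways(gene_pathways, pathway_compounds, top_reporters)
```

Pathways without any matching reporter metabolite yield empty lists, which correspond to the empty cells of the table.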
A.4.2 Comparison between the models
The following tables show the top 10 reporter metabolites of one model alongside the ranks these metabolites receive in the other two models for the same expression data.
African Americans vs. Hispanics
Adipocyte model
KEGG ID | Metabolite name | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value | Recon 1 rank | Recon 1 P-value
C00011 | CO2 | 1 | 0.00955117 | 1211 | 0.550168 | 353 | 0.265944
C00288 | Bicarbonate | 2 | 0.0100439 | 70 | 0.0349696 | 8 | 0.0102825
C00337 | (S)-Dihydroorotate | 3 | 0.0195315 | 290 | 0.142493 | 46 | 0.0372433
- | (S)-Methylmalonate semialdehyde | 4 | 0.0201632 | 470 | 0.211569 | NA | NA
- | Methylmalonate | 4 | 0.0201632 | 470 | 0.211569 | NA | NA
C00864 | (R)-Pantothenate | 5 | 0.0220845 | 73 | 0.0358939 | 44 | 0.0359254
C01346 | dUDP | 6 | 0.0245557 | 1152 | 0.521576 | 708 | 0.539504
C00364 | dTMP | 7 | 0.0292051 | 1696 | 0.795349 | 121 | 0.0995272
C00363 | dTDP | 8 | 0.0352166 | 1152 | 0.521576 | 392 | 0.299371
C00214 | Thymidine | 9 | 0.0368675 | 1783 | 0.837454 | 409 | 0.311549
C00091 | Succinyl-CoA | 10 | 0.0410171 | 927 | 0.41604 | 489 | 0.366924
Table A.41: The top 10 reporter metabolites of the comparison 'African Americans vs. Hispanics' using the adipocyte model in comparison to the EHMN and Recon 1 model.
EHMN model
KEGG ID | Metabolite name | EHMN rank | EHMN P-value | Adipocyte rank | Adipocyte P-value | Recon 1 rank | Recon 1 P-value
CE4988 | 10,11-dihydro-12-epi-leukotriene B4 | 1 | 0.000235915 | NA | NA | NA | NA
CE5944 | 10,11-dihydro-12-oxo-LTB4 | 1 | 0.000235915 | NA | NA | NA | NA
CE4993 | 10,11-dihydro-12R-hydroxy-leukotriene C4 | 1 | 0.000235915 | NA | NA | NA | NA
CE5976 | 12-oxo-10,11-dihydro-20-COOH-LTB4 | 1 | 0.000235915 | NA | NA | NA | NA
CE5525 | 12-oxo-20-carboxy-leukotriene B4 | 1 | 0.000235915 | NA | NA | NA | NA
CE5947 | 20-COOH-10,11-dihydro-LTB4 | 1 | 0.000235915 | NA | NA | NA | NA
C04805 | 5(S)-HETE | 1 | 0.000235915 | NA | NA | NA | NA
CE6246 | 5,12-DiHETE | 1 | 0.000235915 | NA | NA | NA | NA
CE7085 | 5-HEPE | 1 | 0.000235915 | NA | NA | NA | NA
CE2084 | 5-oxo-(6E,8Z,11Z,14Z)-eicosatetraenoic acid | 1 | 0.000235915 | NA | NA | NA | NA
CE7097 | 5-oxo-12(S)-hydroxy-eicosa-6E,8Z,10E,14Z-tetraenoate | 1 | 0.000235915 | NA | NA | NA | NA
CE5178 | 5-oxo-6-trans-leukotriene B4 | 1 | 0.000235915 | NA | NA | NA | NA
CE5349 | 5-oxo-6E-12-epi-LTB4 | 1 | 0.000235915 | NA | NA | NA | NA
CE7111 | 5-oxo-EPE | 1 | 0.000235915 | NA | NA | NA | NA
CE5350 | 6,7-dihydro-12-epi-LTB4 | 1 | 0.000235915 | NA | NA | NA | NA
CE5352 | 6,7-dihydro-leukotriene B4 | 1 | 0.000235915 | NA | NA | NA | NA
CE2445 | 6-trans-leukotriene B4 | 1 | 0.000235915 | NA | NA | NA | NA
CE2446 | 6E-12-epi-LTB4 | 1 | 0.000235915 | NA | NA | NA | NA
C02165 | Leukotriene B4 | 1 | 0.000235915 | NA | NA | 510 | 0.387886
C00639 | Prostaglandin F2alpha | 1 | 0.000235915 | NA | NA | 133 | 0.104798
CE5531 | 12-oxo-c-LTB3 | 2 | 0.00220967 | NA | NA | NA | NA
CE4990 | 12-oxo-leukotriene B4 | 2 | 0.00220967 | NA | NA | NA | NA
C03512 | L-Tryptophanyl-tRNA(Trp) | 3 | 0.00245117 | NA | NA | NA | NA
C01652 | tRNA(Trp) | 3 | 0.00245117 | NA | NA | NA | NA
C00584 | Prostaglandin E2 | 4 | 0.00296839 | NA | NA | 102 | 0.0889991
C05457 | 7alpha,12alpha-Dihydroxycholest-4-en-3-one | 5 | 0.00330511 | NA | NA | 15 | 0.0131415
C05455 | 7alpha-Hydroxycholest-4-en-3-one | 5 | 0.00330511 | NA | NA | 63 | 0.051964
C13856 | 2-Arachidonoylglycerol | 6 | 0.00342886 | NA | NA | NA | NA
CE4987 | 10,11-dihydro-leukotriene B4 | 7 | 0.00347994 | NA | NA | NA | NA
CE2054 | 20-carboxy-leukotriene-B4 | 7 | 0.00347994 | NA | NA | NA | NA
CE5343 | 6,7-dihydro-5-oxo-12-epi-LTB4 | 7 | 0.00347994 | NA | NA | NA | NA
CE5179 | 6,7-dihydro-5-oxo-leukotriene B4 | 7 | 0.00347994 | NA | NA | NA | NA
C00191 | D-Glucuronate | 8 | 0.00378807 | NA | NA | 68 | 0.0539488
C00641 | 1,2-Diacyl-sn-glycerol | 9 | 0.00455281 | NA | NA | NA | NA
C00066 | tRNA | 10 | 0.00570316 | NA | NA | NA | NA
Table A.42: The top 10 reporter metabolites of the comparison 'African Americans vs. Hispanics' using the EHMN model in comparison to the adipocyte and Recon 1 model.
Recon 1 model
KEGG ID | Metabolite name | Recon 1 rank | Recon 1 P-value | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value
C00209 | Oxalate | 1 | 0.00146939 | NA | NA | 815 | 0.360454
C00581 | Guanidinoacetate | 2 | 0.00511799 | NA | NA | 17 | 0.00943231
C06196 | dIMP | 3 | 0.00536709 | NA | NA | NA | NA
C00025 | L-Glutamate | 4 | 0.00625135 | 223 | 0.792349 | 1860 | 0.873716
C05951 | Leukotriene D4 | 4 | 0.00625135 | NA | NA | 1733 | 0.814375
C00052 | UDPgalactose | 5 | 0.0075734 | NA | NA | 229 | 0.114645
C00288 | Bicarbonate | 6 | 0.0101064 | 233 | 0.833987 | 70 | 0.0349696
C05635 | 5-Hydroxyindoleacetate | 7 | 0.0103664 | NA | NA | 1646 | 0.768745
C00427 | Prostaglandin H2 | 8 | 0.0105947 | NA | NA | 774 | 0.345731
C01194 | phosphatidylinositol (homo sapiens) | 9 | 0.0108149 | NA | NA | 1964 | 0.913392
C00058 | Formate | 10 | 0.0121275 | 181 | 0.637771 | 130 | 0.0608291
Table A.43: The top 10 reporter metabolites of the comparison 'African Americans vs. Hispanics' using the Recon 1 model in comparison to the adipocyte and EHMN model.
A.5 Subcutaneous adipose tissue: comparison of weight maintenance and weight regain following an 8-week low calorie diet (GSE24432)
Forty women followed a dietary protocol consisting of an 8-week low calorie diet (LCD) and a 6-month weight maintenance phase [93]:
- 20 participants were classified as weight maintainers (WM)
- 20 participants were classified as weight regainers (WR)
- 2 paired samples per person, one before and one after LCD
The following comparisons were applied for the calculation of the differentially expressed genes:
(i) weight maintenance - before low calorie diet vs. after low calorie diet and (ii) weight regainer - before low calorie diet vs. after low calorie diet.
A.5.1 Differentially expressed genes
The following tables show the top 10 differentially expressed genes, the pathways they are involved in, and the top 10 reporter metabolites that are also involved in these pathways.
Weight maintenance - before low calorie diet vs. after low calorie diet
Rank | EntrezID | RefSeqID | GeneName | P-value | Pathway | Adipocyte | EHMN | Recon1
1 | 9415 | NM_004265 | fatty acid desaturase 2 | 1.11927141264039e-14 | hsa00592: alpha-Linolenic acid metabolism | - | - | -
 |  |  |  |  | hsa01040: Biosynthesis of unsaturated fatty acids | C00510 | C00154, C00412, C00510, C02050, C02249 | C00510, C03035, C16163
 |  |  |  |  | hsa03320: PPAR signaling pathway | - | - | -
2 | 10614 | NM_006460 | hexamethylene bisacetamide inducible 1 | 1.73814074790391e-13 | - | - | - | -
3 | 54518 | NM_019043 | amyloid beta (A4) precursor protein-binding, family B, member 1 interacting protein | 1.51774826561344e-12 | - | - | - | -
4 | 2876 | NM_201397 | glutathione peroxidase 1 | 7.09742698715859e-12 | hsa00480: Glutathione metabolism | - | - | -
 |  |  |  |  | hsa00590: Arachidonic acid metabolism | - | C04742, C04805, C05356, C05965, C05966 | -
 |  |  |  |  | hsa05014: Amyotrophic lateral sclerosis (ALS) | - | - | -
 |  |  |  |  | hsa05016: Huntington's disease | - | - | -
5 | 6319 | NM_005063 | stearoyl-CoA desaturase (delta-9-desaturase) | 1.07987914784104e-11 | hsa01040: Biosynthesis of unsaturated fatty acids | C00510 | C00154, C00412, C00510, C02050, C02249 | C00510, C03035, C16163
 |  |  |  |  | hsa03320: PPAR signaling pathway | - | - | -
6 | 60481 | NM_021814 | ELOVL fatty acid elongase 5 | 2.34361061109303e-11 | hsa00062: Fatty acid elongation | C05272 | C00154, C01944 | C05272
 |  |  |  |  | hsa01040: Biosynthesis of unsaturated fatty acids | C00510 | C00154, C00412, C00510, C02050, C02249 | C00510, C03035, C16163
7 | 25878 | NM_015419 | matrix-remodelling associated 5 | 2.6687478989018e-11 | - | - | - | -
8 | 7280 | NM_001069 | tubulin, beta 2A class IIa | 3.06228306097627e-11 | hsa04145: Phagosome | C00007 | - | -
 |  |  |  |  | hsa04540: Gap junction | - | - | -
 |  |  |  |  | hsa05130: Pathogenic Escherichia coli infection | - | - | -
9 | 23531 | NM_012329 | monocyte to macrophage differentiation-associated | 1.49523954543299e-10 | - | - | - | -
10 | 493 | NM_001001396 | ATPase, Ca++ transporting, plasma membrane 4 | 1.67674757732865e-10 | hsa04020: Calcium signaling pathway | - | - | C00004
 |  |  |  |  | hsa04970: Salivary secretion | - | - | -
 |  |  |  |  | hsa04972: Pancreatic secretion | - | - | -
Table A.44: The top 10 differentially expressed genes from the comparison 'WM - before LCD vs. after LCD' with the corresponding pathways and reporter metabolites.
Weight regainer - before low calorie diet vs. after low calorie diet
Rank | EntrezID | RefSeqID | GeneName | P-value | Pathway | Adipocyte | EHMN | Recon1
1 | 9415 | NM_004265 | fatty acid desaturase 2 | 2.64051773458427e-11 | hsa00592: alpha-Linolenic acid metabolism | - | - | -
 |  |  |  |  | hsa01040: Biosynthesis of unsaturated fatty acids | - | C00154, C00412, C00510, C02050, C02249, C03035, C06426 | C00510
 |  |  |  |  | hsa03320: PPAR signaling pathway | - | - | -
2 | 10957 | NM_006813 | proline-rich nuclear receptor coactivator 1 | 3.3207161581569e-11 | - | - | - | -
3 | 3183 | NM_031314 | heterogeneous nuclear ribonucleoprotein C (C1/C2) | 6.77857173789559e-11 | hsa03040: Spliceosome | - | - | -
4 | 6319 | NM_005063 | stearoyl-CoA desaturase (delta-9-desaturase) | 1.34364957647302e-10 | hsa01040: Biosynthesis of unsaturated fatty acids | - | C00154, C00412, C00510, C02050, C02249, C03035, C06426 | C00510
 |  |  |  |  | hsa03320: PPAR signaling pathway | - | - | -
5 | 6202 | NM_001012 | ribosomal protein S8 | 2.69976568398775e-10 | hsa03010: Ribosome | - | - | -
6 | 4869 | NM_002520 | nucleophosmin (nucleolar phosphoprotein B23, numatrin) | 3.28854625070913e-10 | - | - | - | -
7 | 124 | NM_000667 | alcohol dehydrogenase 1A (class I), alpha polypeptide | 3.47714796918351e-10 | hsa00010: Glycolysis/Gluconeogenesis | - | - | -
 |  |  |  |  | hsa00071: Fatty acid metabolism | - | C00154 | -
 |  |  |  |  | hsa00350: Tyrosine metabolism | - | - | C05576
 |  |  |  |  | hsa00830: Retinol metabolism | - | - | -
 |  |  |  |  | hsa00982: Drug metabolism | - | - | -
 |  |  |  |  | hsa01100: Metabolic pathways | C00003, C00004, C00007, C00129, C00155, C00235, C00341, C00448 | C00154, C00222, C00447, C00577, C01094, C06426 | C00003, C00004, C00051, C00072, C00132, C00577, C01094, C05576
8 | 51429 | NM_016224 | sorting nexin 9 | 1.96217224619829e-09 | - | - | - | -
9 | 23479 | NM_014301 | iron-sulfur cluster scaffold homolog (E. coli) | 2.6234021276621e-09 | - | - | - | -
10 | 9775 | NM_014740 | eukaryotic translation initiation factor 4A3 | 2.70101470848801e-09 | hsa03013: RNA transport | - | - | -
 |  |  |  |  | hsa03015: mRNA surveillance pathway | - | - | -
 |  |  |  |  | hsa03040: Spliceosome | - | - | -
Table A.45: The top 10 differentially expressed genes from the comparison 'WR - before LCD vs. after LCD' with the corresponding pathways and reporter metabolites.
A.5.2 Comparison between the models
The following tables show the top 10 reporter metabolites of one model alongside the ranks these metabolites receive in the other two models for the same expression data.
Weight maintenance - before low calorie diet vs. after low calorie diet
Adipocyte model
KEGG ID | Metabolite name | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value | Recon 1 rank | Recon 1 P-value
- | octadecadienoyl-CoA (C18:2CoA, n-6) | 1 | 0.00000292448 | NA | NA | NA | NA
- | eicosadienoyl-CoA (C20:2CoA, n-6) | 2 | 0.0000625222 | NA | NA | NA | NA
- | octadecatrienoyl-CoA (C18:3CoA, n-3) | 3 | 0.0000704046 | NA | NA | NA | NA
C00007 | O2 | 4 | 0.000162911 | 1223 | 0.692827 | 486 | 0.478058
- | octadecatrienoyl-CoA (C18:3CoA, n-6) | 5 | 0.000181388 | NA | NA | NA | NA
- | stearidonyl coenzyme A (C18:4CoA, n-3) | 5 | 0.000181388 | NA | NA | NA | NA
C00010 | Coenzyme A | 6 | 0.000904182 | 184 | 0.0733702 | 360 | 0.35624
- | eicosatetraenoyl-CoA (C20:4CoA, n-6) | 7 | 0.00144942 | NA | NA | NA | NA
- | eicosatrienoyl-CoA (C20:3CoA, n-6) | 7 | 0.00144942 | NA | NA | NA | NA
- | heptadecenoyl CoA (C17:1CoA, n-8) | 8 | 0.00267006 | NA | NA | NA | NA
- | eicosenoyl-CoA (C20:1CoA, n-11) | 9 | 0.00291506 | NA | NA | NA | NA
- | tetradecenoyl-CoA (C14:1CoA, n-5) | 9 | 0.00291506 | NA | NA | NA | NA
- | docosenoyl-CoA (C22:1CoA, n-9) | 10 | 0.00321812 | NA | NA | NA | NA
C05272 | hexadecenoyl-CoA (C16:1CoA, n-9) | 10 | 0.00321812 | 80 | 0.0202188 | 588 | 0.570928
C00510 | octadecenoyl-CoA (C18:1CoA, n-7) | 10 | 0.00321812 | 79 | 0.0193181 | 1 | 0.00000502544
Table A.46: The top 10 reporter metabolites of the comparison 'WM - before LCD vs. after LCD' using the adipocyte model in comparison to the EHMN and Recon 1 model.
EHMN model
KEGG ID | Metabolite name | EHMN rank | EHMN P-value | Adipocyte rank | Adipocyte P-value | Recon 1 rank | Recon 1 P-value
C02249 | Arachidonyl-CoA | 1 | 0.00000324492 | NA | NA | 227 | 0.23024
C01794 | Choloyl-CoA | 2 | 0.0000594551 | NA | NA | 774 | 0.723132
CE2254 | docosanoyl-CoA | 2 | 0.0000594551 | NA | NA | NA | NA
C00412 | Stearoyl-CoA | 2 | 0.0000594551 | 12 | 0.0137419 | 14 | 0.00383382
CE2257 | tetracosanoyl-CoA | 2 | 0.0000594551 | NA | NA | NA | NA
C02050 | Linoleoyl-CoA | 3 | 0.0000701231 | NA | NA | NA | NA
C00510 | Oleoyl-CoA | 4 | 0.000318929 | 12 | 0.0137419 | 1 | 0.00000502544
C00154 | Palmitoyl-CoA | 4 | 0.000318929 | 14 | 0.0147991 | 52 | 0.0329838
C05965 | 12(S)-HPETE | 5 | 0.000330223 | NA | NA | NA | NA
C04717 | 13(S)-HPODE | 5 | 0.000330223 | NA | NA | NA | NA
CE2163 | 13-hydroxy-(9Z,11E)-octadecadienoate | 5 | 0.000330223 | NA | NA | NA | NA
C04742 | 15(S)-HETE | 5 | 0.000330223 | NA | NA | NA | NA
C05966 | 15(S)-HPETE | 5 | 0.000330223 | NA | NA | NA | NA
C04805 | 5(S)-HETE | 5 | 0.000330223 | NA | NA | NA | NA
C05356 | 5(S)-HPETE | 5 | 0.000330223 | NA | NA | 853 | 0.796538
C14827 | 9(S)-HPODE | 5 | 0.000330223 | NA | NA | NA | NA
CE2539 | 9-hydroxyoctadecadienoate | 5 | 0.000330223 | NA | NA | NA | NA
CE0852 | palmitoleoyl-CoA | 6 | 0.000335923 | NA | NA | NA | NA
C03069 | 3-Methylcrotonyl-CoA | 7 | 0.000899727 | 53 | 0.167654 | 21 | 0.00930988
CE0713 | 3-oxolinoleoyl-CoA | 8 | 0.0011461 | NA | NA | NA | NA
C00332 | Acetoacetyl-CoA | 9 | 0.00153217 | 231 | 0.824807 | 338 | 0.335567
C01944 | Octanoyl-CoA | 9 | 0.00153217 | 92 | 0.312845 | 255 | 0.252494
C00091 | Succinyl-CoA | 9 | 0.00153217 | 76 | 0.257895 | 289 | 0.282306
C00016 | FAD | 10 | 0.00159844 | 248 | 0.873929 | 365 | 0.359982
C01352 | FADH2 | 10 | 0.00159844 | 248 | 0.873929 | 365 | 0.359982
Table A.47: The top 10 reporter metabolites of the comparison 'WM - before LCD vs. after LCD' using the EHMN model in comparison to the adipocyte and Recon 1 model.
Recon 1 model
KEGG ID | Metabolite name | Recon 1 rank | Recon 1 P-value | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value
C00510 | Octadecenoyl-CoA (n-C18:1CoA) | 1 | 0.00000502544 | 12 | 0.0137419 | 79 | 0.0193181
C16218 | trans-Octadec-2-enoyl-CoA | 1 | 0.00000502544 | NA | NA | NA | NA
- | vaccenyl coenzyme A | 1 | 0.00000502544 | NA | NA | NA | NA
- | tetracosahexaenoyl coenzyme A | 2 | 0.000033348 | NA | NA | NA | NA
C00010 | Coenzyme A | 3 | 0.000246448 | 6 | 0.000904182 | 184 | 0.0733702
- | alpha-Linolenoyl-CoA | 4 | 0.000370386 | NA | NA | 167 | 0.065261
C03035 | gamma-linolenoyl-CoA | 4 | 0.000370386 | NA | NA | 38 | 0.00776122
- | linoelaidyl coenzyme A | 4 | 0.000370386 | NA | NA | NA | NA
- | linoleic coenzyme A | 4 | 0.000370386 | NA | NA | NA | NA
C16163 | stearidonyl coenzyme A | 4 | 0.000370386 | NA | NA | NA | NA
- | tetracosapentaenoyl coenzyme A, n-3 | 4 | 0.000370386 | NA | NA | NA | NA
C05272 | Hexadecenoyl-CoA (n-C16:1CoA) | 5 | 0.00113402 | 10 | 0.00321812 | 80 | 0.0202188
C00003 | Nicotinamide adenine dinucleotide | 6 | 0.00127546 | 33 | 0.0961396 | 191 | 0.0769088
C00422 | triacylglycerol (homo sapiens) | 7 | 0.00127821 | NA | NA | 639 | 0.361776
C00004 | Nicotinamide adenine dinucleotide - reduced | 8 | 0.0014729 | 33 | 0.0961396 | 174 | 0.0684839
C00268 | 6,7-Dihydrobiopterin | 9 | 0.00157729 | NA | NA | 112 | 0.0386679
- | 4-Nitrophenyl sulfate | 10 | 0.0019562 | NA | NA | NA | NA
C13690 | Dopamine 3-O-sulfate | 10 | 0.0019562 | NA | NA | NA | NA
Table A.48: The top 10 reporter metabolites of the comparison 'WM - before LCD vs. after LCD' using the Recon 1 model in comparison to the adipocyte and EHMN model.
Weight regainer - before low calorie diet vs. after low calorie diet
Adipocyte model
KEGG ID | Metabolite name | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value | Recon 1 rank | Recon 1 P-value
- | octadecatrienoyl-CoA (C18:3CoA, n-6) | 1 | 0.000234023 | NA | NA | NA | NA
- | stearidonyl coenzyme A (C18:4CoA, n-3) | 1 | 0.000234023 | NA | NA | NA | NA
- | eicosapentaenoyl-CoA (C20:5CoA, n-3) | 2 | 0.00156232 | NA | NA | NA | NA
- | eicosatetraenoyl-CoA (C20:4CoA, n-3) | 2 | 0.00156232 | NA | NA | NA | NA
- | octadecatrienoyl-CoA (C18:3CoA, n-3) | 3 | 0.00735661 | NA | NA | NA | NA
C00155 | L-Homocysteine | 4 | 0.00774941 | 38 | 0.0182404 | 104 | 0.0866209
- | S-(Hydroxymethyl)glutathione | 5 | 0.00786018 | 14 | 0.00596045 | NA | NA
C00007 | O2 | 6 | 0.0144094 | 21 | 0.0102723 | 362 | 0.328023
- | eicosatetraenoyl-CoA (C20:4CoA, n-6) | 7 | 0.0167658 | NA | NA | NA | NA
- | eicosatrienoyl-CoA (C20:3CoA, n-6) | 7 | 0.0167658 | NA | NA | NA | NA
C00003 | Nicotinamide adenine dinucleotide | 8 | 0.0171684 | 241 | 0.107103 | 2 | 0.0000982315
C00004 | Nicotinamide adenine dinucleotide - reduced | 8 | 0.0171684 | 253 | 0.113878 | 1 | 0.0000892173
C00235 | Dimethylallyl diphosphate | 9 | 0.0194827 | 1728 | 0.962407 | 415 | 0.382652
C00448 | Farnesyl diphosphate | 9 | 0.0194827 | 618 | 0.304821 | 332 | 0.300464
C00341 | Geranyl diphosphate | 9 | 0.0194827 | 1296 | 0.721519 | 33 | 0.0163746
C00129 | Isopentenyl diphosphate | 9 | 0.0194827 | 1728 | 0.962407 | 482 | 0.460095
- | octadecadienoyl-CoA (C18:2CoA, n-6) | 10 | 0.0226904 | NA | NA | NA | NA
Table A.49: The top 10 reporter metabolites of the comparison 'WR - before LCD vs. after LCD' using the adipocyte model in comparison to the EHMN and Recon 1 model.
EHMN model
KEGG ID | Metabolite name | EHMN rank | EHMN P-value | Adipocyte rank | Adipocyte P-value | Recon 1 rank | Recon 1 P-value
C02050 | Linoleoyl-CoA | 1 | 0.000133458 | NA | NA | NA | NA
C00510 | Oleoyl-CoA | 2 | 0.000296351 | 66 | 0.269027 | 4 | 0.00061122
C00154 | Palmitoyl-CoA | 2 | 0.000296351 | 64 | 0.253214 | 203 | 0.186697
C00412 | Stearoyl-CoA | 2 | 0.000296351 | 66 | 0.269027 | 111 | 0.0907745
C01094 | D-Fructose 1-phosphate | 3 | 0.000596177 | NA | NA | 6 | 0.000684636
C00577 | D-Glyceraldehyde | 4 | 0.00104135 | NA | NA | 5 | 0.000678521
C06426 | (6Z,9Z,12Z)-Octadecatrienoic acid | 5 | 0.00111288 | NA | NA | 621 | 0.577515
C03035 | gamma-Linolenoyl-CoA | 5 | 0.00111288 | NA | NA | 529 | 0.503971
CE4815 | stearidonoyl-CoA | 5 | 0.00111288 | NA | NA | NA | NA
CE4824 | tetracosa-9,12,15,18,21-all-cis-pentaenoyl-CoA | 5 | 0.00111288 | NA | NA | NA | NA
CE4837 | tetracosa-9,12,15,18-all-cis-tetraenoyl-CoA | 5 | 0.00111288 | NA | NA | NA | NA
C02249 | Arachidonyl-CoA | 6 | 0.00121838 | NA | NA | 101 | 0.0836376
CE4809 | alpha-linolenoyl-CoA | 7 | 0.00203885 | NA | NA | NA | NA
CE4823 | tetracosa-6,9,12,15,18,21-all-cis-hexaenoyl-CoA | 7 | 0.00203885 | NA | NA | NA | NA
CE4836 | tetracosa-6,9,12,15,18-all-cis-pentaenoyl-CoA | 7 | 0.00203885 | NA | NA | NA | NA
C00201 | Nucleoside triphosphate | 8 | 0.0021019 | NA | NA | NA | NA
C00222 | 3-Oxopropanoate | 9 | 0.0032366 | NA | NA | 368 | 0.333549
C00447 | Sedoheptulose 1,7-bisphosphate | 9 | 0.0032366 | NA | NA | NA | NA
CE2434 | trans,cis,cis-2,9,12-octadecatrienoyl-CoA | 10 | 0.00487151 | NA | NA | NA | NA
CE2596 | trans,cis-dodeca-2,5-dienoyl-CoA | 10 | 0.00487151 | NA | NA | NA | NA
CE2591 | trans,cis-hexadeca-2,9-dienoyl-CoA | 10 | 0.00487151 | NA | NA | NA | NA
CE2594 | trans,cis-myristo-2,7-dienoyl-CoA | 10 | 0.00487151 | NA | NA | NA | NA
CE2432 | trans-2-cis,cis-5,8-tetradecatrienoyl-CoA | 10 | 0.00487151 | NA | NA | NA | NA
Table A.50: The top 10 reporter metabolites of the comparison 'WR - before LCD vs. after LCD' using the EHMN model in comparison to the adipocyte and Recon 1 model.
Recon 1 model
KEGG ID | Metabolite name | Recon 1 rank | Recon 1 P-value | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value
C00004 | Nicotinamide adenine dinucleotide - reduced | 1 | 0.0000892173 | 135 | 0.531749 | 253 | 0.113878
C00003 | Nicotinamide adenine dinucleotide | 2 | 0.0000982315 | 135 | 0.531749 | 241 | 0.107103
C05576 | 3,4-Dihydroxyphenylethyleneglycol | 3 | 0.000520894 | NA | NA | 74 | 0.0275416
C00132 | Methanol | 3 | 0.000520894 | NA | NA | 624 | 0.309013
C00510 | Octadecenoyl-CoA (n-C18:1CoA) | 4 | 0.00061122 | 66 | 0.269027 | 498 | 0.247786
C16218 | trans-Octadec-2-enoyl-CoA | 4 | 0.00061122 | NA | NA | NA | NA
- | vaccenyl coenzyme A | 4 | 0.00061122 | NA | NA | NA | NA
C00577 | D-Glyceraldehyde | 5 | 0.000678521 | NA | NA | 4 | 0.00104135
C01094 | D-Fructose 1-phosphate | 6 | 0.000684636 | NA | NA | 3 | 0.000596177
- | D-Xylulose 1-phosphate | 6 | 0.000684636 | NA | NA | NA | NA
C03451 | (R)-S-Lactoylglutathione | 7 | 0.000914369 | NA | NA | 1715 | 0.956589
C00051 | Reduced glutathione | 8 | 0.00132034 | 136 | 0.533874 | 1506 | 0.844639
C00072 | L-Ascorbate | 9 | 0.00163147 | NA | NA | 25 | 0.0125241
C00424 | L-Lactaldehyde | 10 | 0.0016341 | NA | NA | NA | NA
Table A.51: The top 10 reporter metabolites of the comparison 'WR - before LCD vs. after LCD' using the Recon 1 model in comparison to the adipocyte and EHMN model.
A.5.3 Comparison of expression data
The following tables compare the top 10 reporter metabolites obtained with the same model from the different expression data of this dataset.
Adipocyte
KEGG ID | Metabolite name | Rank (WM) | P-value (WM) | Rank (WR) | P-value (WR)
- | octadecadienoyl-CoA (C18:2CoA, n-6) | 1 | 0.00000292448 | 10 | 0.0226904
- | eicosadienoyl-CoA (C20:2CoA, n-6) | 2 | 0.0000625222 | 117 | 0.460496
- | octadecatrienoyl-CoA (C18:3CoA, n-3) | 3 | 0.0000704046 | 3 | 0.00735661
C00007 | O2 | 4 | 0.000162911 | 6 | 0.0144094
- | octadecatrienoyl-CoA (C18:3CoA, n-6) | 5 | 0.000181388 | 1 | 0.000234023
- | stearidonyl coenzyme A (C18:4CoA, n-3) | 5 | 0.000181388 | 1 | 0.000234023
C00010 | Coenzyme A | 6 | 0.000904182 | 42 | 0.178629
- | eicosatetraenoyl-CoA (C20:4CoA, n-6) | 7 | 0.00144942 | 7 | 0.0167658
- | eicosatrienoyl-CoA (C20:3CoA, n-6) | 7 | 0.00144942 | 7 | 0.0167658
- | heptadecenoyl CoA (C17:1CoA, n-8) | 8 | 0.00267006 | 112 | 0.452045
- | eicosenoyl-CoA (C20:1CoA, n-11) | 9 | 0.00291506 | 87 | 0.351476
- | tetradecenoyl-CoA (C14:1CoA, n-5) | 9 | 0.00291506 | 87 | 0.351476
- | docosenoyl-CoA (C22:1CoA, n-9) | 10 | 0.00321812 | 101 | 0.403862
C05272 | hexadecenoyl-CoA (C16:1CoA, n-9) | 10 | 0.00321812 | 101 | 0.403862
C00510 | octadecenoyl-CoA (C18:1CoA, n-7) | 10 | 0.00321812 | 66 | 0.269027
Table A.52: The comparison of the top 10 reporter metabolites between 'WM - before LCD vs. after LCD' and 'WR - before LCD vs. after LCD' based on the adipocyte model.
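A quick way to read such a two-condition table is to extract the metabolites that fall within the top 10 in both comparisons. The sketch below does this for rank mappings; the function name is hypothetical, and the example uses four of the WM and WR ranks from the adipocyte comparison in Table A.52 rather than the full ranking.

```python
def shared_reporters(ranks_a, ranks_b, top=10):
    """Return metabolites ranked within `top` in both conditions as name -> (rank_a, rank_b)."""
    return {
        met: (rank, ranks_b[met])
        for met, rank in ranks_a.items()
        if rank <= top and ranks_b.get(met, top + 1) <= top
    }


# Subset of the ranks from the adipocyte model comparison (WM vs. WR).
wm_ranks = {
    "octadecadienoyl-CoA (C18:2CoA, n-6)": 1,
    "eicosadienoyl-CoA (C20:2CoA, n-6)": 2,
    "O2": 4,
    "Coenzyme A": 6,
}
wr_ranks = {
    "octadecadienoyl-CoA (C18:2CoA, n-6)": 10,
    "eicosadienoyl-CoA (C20:2CoA, n-6)": 117,
    "O2": 6,
    "Coenzyme A": 42,
}
shared = shared_reporters(wm_ranks, wr_ranks)
```

For this subset, `shared` contains octadecadienoyl-CoA (ranks 1 and 10) and O2 (ranks 4 and 6), while eicosadienoyl-CoA and Coenzyme A drop out because their WR ranks lie outside the top 10.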
KEGG ID | Metabolite name | Rank (WR) | P-value (WR) | Rank (WM) | P-value (WM)
- | octadecatrienoyl-CoA (C18:3CoA, n-6) | 1 | 0.000234023 | 5 | 0.000181388
- | stearidonyl coenzyme A (C18:4CoA, n-3) | 1 | 0.000234023 | 5 | 0.000181388
- | eicosapentaenoyl-CoA (C20:5CoA, n-3) | 2 | 0.00156232 | 16 | 0.0164849
- | eicosatetraenoyl-CoA (C20:4CoA, n-3) | 2 | 0.00156232 | 16 | 0.0164849
- | octadecatrienoyl-CoA (C18:3CoA, n-3) | 3 | 0.00735661 | 3 | 0.0000704046
C00155 | L-Homocysteine | 4 | 0.00774941 | 46 | 0.15299
- | S-(Hydroxymethyl)glutathione | 5 | 0.00786018 | 90 | 0.310086
C00007 | O2 | 6 | 0.0144094 | 4 | 0.000162911
- | eicosatetraenoyl-CoA (C20:4CoA, n-6) | 7 | 0.0167658 | 7 | 0.00144942
- | eicosatrienoyl-CoA (C20:3CoA, n-6) | 7 | 0.0167658 | 7 | 0.00144942
C00003 | Nicotinamide adenine dinucleotide | 8 | 0.0171684 | 33 | 0.0961396
C00004 | Nicotinamide adenine dinucleotide - reduced | 8 | 0.0171684 | 33 | 0.0961396
C00235 | Dimethylallyl diphosphate | 9 | 0.0194827 | 239 | 0.841729
C00448 | Farnesyl diphosphate | 9 | 0.0194827 | 104 | 0.378231
C00341 | Geranyl diphosphate | 9 | 0.0194827 | 229 | 0.815733
C00129 | Isopentenyl diphosphate | 9 | 0.0194827 | 239 | 0.841729
- | octadecadienoyl-CoA (C18:2CoA, n-6) | 10 | 0.0226904 | 1 | 0.00000292448
Table A.53: The comparison of the top 10 reporter metabolites between 'WR - before LCD vs. after LCD' and 'WM - before LCD vs. after LCD' based on the adipocyte model.
EHMN
WM
KEGG ID
WR
Metabolite name
Rank
P-value
C02249
Arachidonyl-CoA
1
0.00000324492
Rank
27
P-value
0.0140488
C01794
Choloyl-CoA
2
0.0000594551
386
0.186458
CE2254
docosanoyl-CoA
2
0.0000594551
605
0.298826
C00412
Stearoyl-CoA
2
0.0000594551
10
CE2257
tetracosanoyl-CoA
2
0.0000594551
739
C02050
Linoleoyl-CoA
3
0.0000701231
1
C00510
Oleoyl-CoA
4
0.000318929
498
C00154
Palmitoyl-CoA
4
0.000318929
2
C05965
12(S)-HPETE
5
0.000330223
1644
C04717
13(S)-HPODE
5
0.000330223
315
0.14657
CE2163
13-hydroxy-(9Z,11E)-octadecadienoate
5
0.000330223
315
0.14657
C04742
15(S)-HETE
5
0.000330223
1644
0.921654
C05966
15(S)-HPETE
5
0.000330223
1674
0.937448
C04805
5(S)-HETE
5
0.000330223
1773
0.987977
C05356
5(S)-HPETE
5
0.000330223
593
0.288994
C14827
9(S)-HPODE
5
0.000330223
1349
0.748942
CE2539
9-hydroxyoctadecadienoate
5
0.000330223
1251
0.699966
CE0852
palmitoleoyl-CoA
6
0.000335923
99
C03069
3-Methylcrotonyl-CoA
7
0.000899727
750
CE0713
3-oxolinoleoyl-CoA
8
0.0011461
C00332
Acetoacetyl-CoA
9
0.00153217
336
C01944
Octanoyl-CoA
9
0.00153217
1473
C00091
Succinyl-CoA
9
0.00153217
36
C00016
FAD
10
0.00159844
307
0.144276
C01352
FADH2
10
0.00159844
866
0.448917
36
0.00304985
0.371139
0.000133458
0.247786
0.000296351
0.921654
0.0385497
0.377265
0.0178061
0.15702
0.830036
0.0178061
Table A.54: The comparison of the top 10 reporter metabolites between ’WM - before LCD vs. after
LCD’ and ’WR - before LCD vs. after LCD’ based on the EHMN model.
KEGG ID | Metabolite name | WR rank | WR P-value | WM rank | WM P-value
C02050 | Linoleoyl-CoA | 1 | 0.000133458 | 3 | 0.0000701231
C00510 | Oleoyl-CoA | 2 | 0.000296351 | 79 | 0.0193181
C00154 | Palmitoyl-CoA | 2 | 0.000296351 | 4 | 0.000318929
C00412 | Stearoyl-CoA | 2 | 0.000296351 | 7 | 0.000456018
C01094 | D-Fructose 1-phosphate | 3 | 0.000596177 | 321 | 0.159899
C00577 | D-Glyceraldehyde | 4 | 0.00104135 | 1245 | 0.705411
C06426 | (6Z,9Z,12Z)-Octadecatrienoic acid | 5 | 0.00111288 | 38 | 0.00776122
C03035 | gamma-Linolenoyl-CoA | 5 | 0.00111288 | 38 | 0.00776122
CE4815 | stearidonoyl-CoA | 5 | 0.00111288 | 207 | 0.0863126
CE4824 | tetracosa-9,12,15,18,21-all-cis-pentaenoyl-CoA | 5 | 0.00111288 | 207 | 0.0863126
CE4837 | tetracosa-9,12,15,18-all-cis-tetraenoyl-CoA | 5 | 0.00111288 | 207 | 0.0863126
C02249 | Arachidonyl-CoA | 6 | 0.00121838 | 94 | 0.0290386
CE4809 | alpha-linolenoyl-CoA | 7 | 0.00203885 | 167 | 0.065261
CE4823 | tetracosa-6,9,12,15,18,21-all-cis-hexaenoyl-CoA | 7 | 0.00203885 | 167 | 0.065261
CE4836 | tetracosa-6,9,12,15,18-all-cis-pentaenoyl-CoA | 7 | 0.00203885 | 167 | 0.065261
C00201 | Nucleoside triphosphate | 8 | 0.0021019 | 1407 | 0.779159
C00222 | 3-Oxopropanoate | 9 | 0.0032366 | 401 | 0.205611
C00447 | Sedoheptulose 1,7-bisphosphate | 9 | 0.0032366 | 401 | 0.205611
CE2434 | trans,cis,cis-2,9,12-octadecatrienoyl-CoA | 10 | 0.00487151 | 26 | 0.0049764
CE2596 | trans,cis-dodeca-2,5-dienoyl-CoA | 10 | 0.00487151 | 26 | 0.0049764
CE2591 | trans,cis-hexadeca-2,9-dienoyl-CoA | 10 | 0.00487151 | 26 | 0.0049764
CE2594 | trans,cis-myristo-2,7-dienoyl-CoA | 10 | 0.00487151 | 26 | 0.0049764
CE2432 | trans-2-cis,cis-5,8-tetradecatrienoyl-CoA | 10 | 0.00487151 | 107 | 0.034329
Table A.55: The comparison of the top 10 reporter metabolites between ’WR - before LCD vs. after
LCD’ and ’WM - before LCD vs. after LCD’ based on the EHMN model.
Recon 1
WM
KEGG ID
WR
Metabolite name
Rank
P-value
Rank
P-value
C00510
Octadecenoyl-CoA (n-C18:1CoA)
1
0.00000502544
4
C16218
trans-Octadec-2-enoyl-CoA
1
0.00000502544
529
0.503971
vaccenyl coenzyme A
1
0.00000502544
529
0.503971
tetracosahexaenoyl coenzyme A
2
0.000033348
19
Coenzyme A
3
0.000246448
237
0.213742
alpha-Linolenoyl-CoA
4
0.000370386
529
0.503971
gamma-linolenoyl-CoA
4
0.000370386
529
0.503971
linoelaidyl coenzyme A
4
0.000370386
529
0.503971
linoleic coenzyme A
4
0.000370386
529
0.503971
stearidonyl coenzyme A
4
0.000370386
529
0.503971
tetracosapentaenoyl coenzyme A, n-3
4
0.000370386
195
0.173053
C05272
Hexadecenoyl-CoA (n-C16:1CoA)
5
0.00113402
647
0.601787
C00003
Nicotinamide adenine dinucleotide
6
0.00127546
2
C00422
triacylglycerol (homo sapiens)
7
0.00127821
343
C00004
Nicotinamide adenine dinucleotide - reduced
8
0.0014729
C00268
6,7-Dihydrobiopterin
9
0.00157729
4-Nitrophenyl sulfate
10
0.0019562
23
0.0109026
Dopamine 3-O-sulfate
10
0.0019562
23
0.0109026
C00010
C03035
C16163
C13690
1
871
0.00061122
0.0089341
0.0000982315
0.310791
0.0000892173
0.830442
Table A.56: The comparison of the top 10 reporter metabolites between ’WM - before LCD vs. after
LCD’ and ’WR - before LCD vs. after LCD’ based on the Recon 1 model.
KEGG ID | Metabolite name | WR rank | WR P-value | WM rank | WM P-value
C00004 | Nicotinamide adenine dinucleotide - reduced | 1 | 0.0000892173 | 8 | 0.0014729
C00003 | Nicotinamide adenine dinucleotide | 2 | 0.0000982315 | 6 | 0.00127546
C05576 | 3,4-Dihydroxyphenylethyleneglycol | 3 | 0.000520894 | 130 | 0.110313
C00132 | Methanol | 3 | 0.000520894 | 130 | 0.110313
C00510 | Octadecenoyl-CoA (n-C18:1CoA) | 4 | 0.00061122 | 1 | 0.00000502544
C16218 | trans-Octadec-2-enoyl-CoA | 4 | 0.00061122 | 415 | 0.40608
- | vaccenyl coenzyme A | 4 | 0.00061122 | 415 | 0.40608
C00577 | D-Glyceraldehyde | 5 | 0.000678521 | 647 | 0.619548
C01094 | D-Fructose 1-phosphate | 6 | 0.000684636 | 172 | 0.159806
- | D-Xylulose 1-phosphate | 6 | 0.000684636 | 172 | 0.159806
C03451 | (R)-S-Lactoylglutathione | 7 | 0.000914369 | 607 | 0.5852
C00051 | Reduced glutathione | 8 | 0.00132034 | 68 | 0.048087
C00072 | L-Ascorbate | 9 | 0.00163147 | 664 | 0.62973
C00424 | L-Lactaldehyde | 10 | 0.0016341 | 127 | 0.107009
Table A.57: The comparison of the top 10 reporter metabolites between ’WR - before LCD vs. after
LCD’ and ’WM - before LCD vs. after LCD’ based on the Recon 1 model.
A.6 Characterization of the initial molecular events of adipose tissue development and growth during overfeeding in humans (GSE28005)
Healthy lean and overweight subjects were placed on a high-fat diet for 56 days [13]:
- 18 subjects
- 3 paired samples per subject, taken at day 0, day 14, and day 56
Differentially expressed genes were calculated for the following comparisons:
(i) day 0 vs. day 14 and (ii) day 0 vs. day 56.
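Because the samples are paired, each comparison can be evaluated on the per-subject differences between the two time points. The following Python sketch illustrates this with a plain paired t-statistic for a single gene. This statistic is an assumption made for illustration; this section does not state which test was used in the thesis. The names `paired_t`, `paired_statistics`, and `COMPARISONS`, as well as the expression values, are hypothetical.

```python
import math
import statistics

# The two comparisons of this dataset: (baseline day, follow-up day).
COMPARISONS = [(0, 14), (0, 56)]


def paired_t(before, after):
    """Paired t-statistic: mean of the differences divided by its standard error."""
    diffs = [a - b for b, a in zip(before, after)]
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / std_err


def paired_statistics(samples):
    """samples maps subject -> {day: expression value of one gene}."""
    subjects = sorted(samples)
    result = {}
    for base, follow in COMPARISONS:
        before = [samples[s][base] for s in subjects]
        after = [samples[s][follow] for s in subjects]
        result[(base, follow)] = paired_t(before, after)
    return result


# Hypothetical log-expression values of one gene for three subjects.
samples = {
    "s1": {0: 5.0, 14: 6.0, 56: 8.0},
    "s2": {0: 4.0, 14: 6.0, 56: 6.0},
    "s3": {0: 6.0, 14: 9.0, 56: 7.0},
}
stats = paired_statistics(samples)
print(stats)
```

Converting the statistic into a P-value additionally requires the t-distribution with n - 1 degrees of freedom, which is omitted here.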
A.6.1 Differentially expressed genes
The following tables show the top 10 differentially expressed genes, the pathways they are involved in, and those of the top 10 reporter metabolites that are also involved in these pathways.
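The metabolite columns of these tables can be read as an intersection: for each pathway of a gene, the metabolites of that pathway are intersected with the top 10 reporter metabolites of each model. The sketch below shows this in Python. The function name `shared_metabolites` is hypothetical, and the pathway membership and the reporter sets are shortened, illustrative inputs rather than the full data used in the thesis.

```python
def shared_metabolites(gene_pathways, pathway_metabolites, top_reporters):
    """For each pathway of a gene, list the top reporter metabolites of each model found in it."""
    table = {}
    for pathway in gene_pathways:
        members = pathway_metabolites.get(pathway, set())
        table[pathway] = {model: sorted(members & reporters)
                          for model, reporters in top_reporters.items()}
    return table


# Illustrative inputs (KEGG IDs); the pathway membership shown here is an assumption.
gene_pathways = ["hsa04975"]
pathway_metabolites = {"hsa04975": {"C00010", "C00422"}}
top_reporters = {
    "Adipocyte": {"C00010", "C00083"},
    "EHMN": {"C00024", "C00093"},
    "Recon1": {"C00422", "C00097"},
}

result = shared_metabolites(gene_pathways, pathway_metabolites, top_reporters)
print(result)
```

An empty list for a model corresponds to an empty cell in the table.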
Day 0 vs. day 14
Rank
EntrezID
GeneName
P-value
Pathway
Adipocyte
EHMN
Recon1
RefSeqID
1
4739
NM 001142393
2
neural precursor
NM 006403
developmentally
NM 182966
down-regulated 9
6954
NM 001093728
2.34864686333649e-06
cell expressed,
t-complex 11
2.13866148594628e-05
homolog (mouse)
NM 018679
3
1490
NM 001901
4
286002
XM 001715026
connective tissue
2.4394198929303e-05
growth factor
hypothetical prot-
2.66722001916861e-05
ein LOC286002
XM 001718146
XM 001718326
5
253635
NM 174931
6
30814
NM 014589
coiled-coil domain
3.88394993829332e-05
containing 75
phospholipase A2,
9.85819950741516e-05
group IIE
hsa00564: Glycerophospho-
C00093
lipid metabolism
hsa00565: Ether lipid
metabolism
hsa00590: Arachidonic acid
metabolism
hsa00591: Linoleic acid
metabolism
hsa00592: alpha-Linolenic
acid metabolism
hsa01100: Metabolic
C00010
C00008
C00001
pathways
C00026
C00018
C00083
C00041
C00024
C00097
C00083
C00026
C00164
C00097
C00093
C00249
C00164
C00314
C00422
C00249
C00534
C01944
C03373
C00627
C03373
C05272
C00647
C03373
Rank
EntrezID
GeneName
P-value
Pathway
Adipocyte
EHMN
Recon1
RefSeqID
hsa04010: MAPK signaling
pathway
hsa04270: Vascular smooth
muscle contraction
hsa04370: VEGF signaling
pathway
hsa04664: Fc epsilon RI
signaling pathway
hsa04724: Glutamatergic
synapse
hsa04726: Serotonergic
synapse
hsa04730: Long-term
depression
hsa04912: GnRH signaling
pathway
hsa04972: Pancreatic
C00001
secretion
hsa04975: Fat digestion
C00010
C00422
and absorption
hsa05145: Toxoplasmosis
7
79695
NM 024642
UDP-N-acetyl-
0.000115028315666452
hsa00512: Mucin type
alpha-D-
O-Glycan biosynthesis
galactosamine:
hsa01100: Metabolic
C00010
C00008
C00001
polypeptide
pathways
C00026
C00018
C00083
N-acetyl-
C00041
C00024
C00097
galactosaminyl-
C00083
C00026
C00164
transferase 12
C00097
C00093
C00249
(GalNAc-T12)
C00164
C00314
C00422
C00249
C00534
C01944
C03373
C00627
C03373
C05272
C00647
C03373
8
401068
hypothetical gene
XR 041577
supported by
XR 041578
BC028186
0.000122753694924759
XR 041578
9
84649
NM 032564
diacylglycerol
0.000145614029643437
hsa00561: Glycerolipid
O-acyltransferase
metabolism
homolog 2 (mouse)
hsa01100: Metabolic
pathways
C00093
C00422
C00010
C00008
C00001
C00026
C00018
C00083
C00041
C00024
C00097
C00083
C00026
C00164
C00097
C00093
C00249
C00164
C00314
C00422
C00249
C00534
C01944
C03373
C00627
C03373
C05272
C00647
C03373
hsa04975: Fat digestion
C00010
C00422
and absorption
10
166979
NM 152623
cell division cycle
0.000172732888620482
20 homolog B
(S. cerevisiae)
Table A.58: The top 10 differentially expressed genes from the comparison ’day 0 vs. day 14’ with the
corresponding pathways and reporter metabolites.
Day 0 vs. day 56
Rank
EntrezID
GeneName
P-value
Pathway
Adipocyte
EHMN
Recon1
RefSeqID
1
9843
hephaestin
1.40947289382373e-06
NM 001130860
hsa04978: Mineral
absorption
NM 014799
NM 138737
2
2115
ets variant 1
2.50155485453957e-06
1535
cytochrome b-245,
4.28785657794403e-06
NM 000101
alpha polypeptide
NM 004956
3
hsa05202: Transcriptional
misregulation in cancer
hsa04145: Phagosome
C00007
C00027
C00704
hsa04380: Osteoclast
C00027
differentiation
hsa04670: Leukocyte trans-
C00027
endothelial migration
hsa05140: Leishmaniasis
4
441024
NM 001004346
methylenetetra-
4.44745625822843e-06
hsa00670: One carbon
hydrofolate
pool by folate
dehydrogenase
hsa01100: Metabolic
C00005
C00002
C00154
(NADP+ depen-)
pathways
C00006
C00008
C00164
C00007
C00060
C05272
C00092
C00164
C00122
C00257
C00164
C00426
dent 2-like
C00364
C01061
C00365
G00159
G00160
5
1278
NM 000089
collagen, type I,
4.80281529509077e-06
alpha 2
hsa04510: Focal adhesion
hsa04512: ECM-receptor
interaction
hsa04974: Protein digestion
and absorption
hsa05146: Amoebiasis
6
7
6423
secreted frizzled-
NM 003013
related protein 2
10644
NM 001007225
8
insulin-like growth
C00027
hsa04310: Wnt signaling
pathway
1.15857882753486e-05
factor 2 mRNA
NM 006548
binding protein 2
27115
phosphodiesterase
NM 018945
5.99850512222573e-06
1.24992059917824e-05
7B
hsa00230: Purine
C00002
metabolism
C00008
hsa05032: Morphine
addiction
9
7140
NM 001042780
troponin T type 3
1.43295940851921e-05
(skeletal, fast)
NM 001042781
NM 001042782
NM 006757
10
54829
asporin
1.45832184738187e-05
NM 017680
Table A.59: The top 10 differentially expressed genes from the comparison ’day 0 vs. day 56’ with the
corresponding pathways and reporter metabolites.
A.6.2 Comparison between the models
Each of the following tables lists the top 10 reporter metabolites of one model together with the ranks and P-values of the same metabolites in the other two models, based on the same expression data.
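A metabolite that does not occur in one of the other models is reported as NA in these tables. The Python sketch below shows one way to assemble such rows, assuming that the results of each model are available as a mapping from KEGG ID to a (rank, P-value) pair. The function name `cross_model_rows` is hypothetical; the example values are two rows of the EHMN table for 'day 0 vs. day 14'.

```python
def cross_model_rows(reference, others, top=10):
    """Rows for the metabolites ranked <= top in `reference`, with NA where a model lacks them."""
    rows = []
    for kegg_id, (rank, p_value) in sorted(reference.items(), key=lambda kv: kv[1]):
        if rank > top:
            continue
        row = [kegg_id, rank, p_value]
        for model in others:
            # A metabolite missing from a model yields NA for both rank and P-value.
            row.extend(others[model].get(kegg_id, ("NA", "NA")))
        rows.append(row)
    return rows


ehmn = {"C00024": (1, 0.00143499), "C00093": (3, 0.00199595)}
others = {
    "Adipocyte": {"C00024": (227, 0.669832)},
    "Recon 1": {"C00024": (281, 0.211005), "C00093": (44, 0.0261254)},
}

rows = cross_model_rows(ehmn, others)
for row in rows:
    print(row)
```

Matching by KEGG ID only works for metabolites that carry an ID in every model; entries without an ID would need to be matched by name.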
Day 0 vs. day 14
Adipocyte model
KEGG ID
Metabolite name
EHMN model
Rank
P-value
Rank
1
0.0031377
C00041
L-Alanine
2
0.00446266
1183
C00097
L-Cysteine
2
0.00446266
26
0.0165483
4
C00164
Acetoacetate
3
0.00471785
67
0.0335582
823
0.648248
eicosadienoyl-CoA (C20:2CoA, n-6)
4
0.00673465
NA
NA
NA
NA
dodecanoate (C12:0)
5
0.00717003
35
0.0208393
136
0.106821
Eicosanoate (n-C20:0)
5
0.00717003
NA
NA
NA
NA
C00249
hexadecanoate (n-C16:0)
5
0.00717003
1227
0.605387
591
0.447948
C01530
octadecanoate (n-C18:0)
5
0.00717003
382
0.179519
259
0.194384
pentadecanoate (C15:0)
5
0.00717003
NA
NA
NA
NA
tetradecanoate (C14:0)
5
0.00717003
35
0.0208393
59
0.0363937
docosenoyl-CoA (C22:1CoA, n-9)
6
0.00736875
NA
NA
NA
NA
C05272
hexadecenoyl-CoA (C16:1CoA, n-9)
6
0.00736875
470
0.231362
1195
C00510
octadecenoyl-CoA (C18:1CoA, n-7)
6
0.00736875
667
0.332364
213
C03373
5-amino-1-(5-phospho-D-ribosyl)imidazole
7
0.00862875
4
C00026
2-Oxoglutarate
8
0.0120078
1317
0.649452
522
0.395589
sn-Glycerol 3-phosphate
9
0.0135386
880
0.430031
NA
NA
0.015976
155
0.0728419
C00010
Coenzyme A
10
0.583894
0.00475059
10
P-value
Malonyl-CoA
C06424
0.0101903
Recon 1 model
Rank
C00083
C02679
12
P-value
0.00818013
154
0.121584
0.00216723
0.96341
0.163803
9
0.00590893
1003
0.796107
Table A.60: The top 10 reporter metabolites of the comparison ’day 0 vs. day 14’ using the adipocyte
model in comparison to the EHMN and Recon 1 model.
EHMN model
KEGG ID
Metabolite name
Adipocyte model
Rank
P-value
Rank
P-value
Recon 1 model
Rank
P-value
C00024
Acetyl-CoA
1
0.00143499
227
0.669832
281
0.211005
CE5166
25(S)-trihydroxycoprostanoyl-CoA
2
0.00148815
NA
NA
NA
NA
C00093
sn-Glycerol 3-phosphate
3
0.00199595
NA
NA
44
0.0261254
C03373
Aminoimidazole ribotide
4
0.00475059
7
C03125
L-Cysteinyl-tRNA(Cys)
5
0.00489944
NA
NA
NA
NA
C01639
tRNA(Cys)
5
0.00489944
NA
NA
NA
NA
C00026
2-Oxoglutarate
6
0.00542615
217
0.636648
522
0.395589
C02249
Arachidonyl-CoA
7
0.00641238
NA
NA
826
0.651886
C00018
Pyridoxal phosphate
8
0.00698159
NA
NA
NA
NA
C00534
Pyridoxamine
8
0.00698159
NA
NA
212
0.16237
C00647
Pyridoxamine phosphate
8
0.00698159
NA
NA
NA
NA
C00314
Pyridoxine
8
0.00698159
NA
NA
212
0.16237
C00627
Pyridoxine phosphate
8
0.00698159
NA
NA
NA
NA
C03721
Protein tyrosine-O-sulfate
9
0.00761396
NA
NA
NA
NA
C00008
ADP
10
0.00790762
211
0.600735
40
0.0244097
0.00862875
9
0.00590893
Table A.61: The top 10 reporter metabolites of the comparison ’day 0 vs. day 14’ using the EHMN
model in comparison to the adipocyte and Recon 1 model.
Recon1 model
KEGG ID
Metabolite name
Adipocyte model
Rank
P-value
Rank
45
P-value
0.1134
EHMN model
Rank
L-Cysteine
1
0.000527156
C00422
triacylglycerol (homo sapiens)
2
0.000885051
NA
NA
C00164
Acetoacetate
3
0.00135937
240
0.753101
67
C00001
H2O
4
0.00302208
114
0.323002
381
0.178984
C19586
8,9 epxoy aflatoxin B1
5
0.00329827
NA
NA
NA
NA
C06800
aflatoxin B1
5
0.00329827
NA
NA
NA
NA
C14497
6 beta hydroxy testosterone
6
0.00346225
NA
NA
1739
0.841028
C00249
Hexadecanoate (n-C16:0)
7
0.0046149
5
0.00717003
1227
0.605387
C03373
5-amino-1-(5-phospho-D-ribosyl)imidazole
8
0.00590893
7
0.00862875
4
C00083
Malonyl-CoA
9
0.00818013
1
0.0031377
C01944
Octanoyl-CoA (n-C8:0CoA)
10
0.00924781
115
0.324839
26
P-value
C00097
697
12
413
0.0165483
0.349489
0.0335582
0.00475059
0.0101903
0.19678
Table A.62: The top 10 reporter metabolites of the comparison ’day 0 vs. day 14’ using the Recon 1
model in comparison to the adipocyte and EHMN model.
Day 0 vs. day 56
Adipocyte model
KEGG ID
Metabolite name
EHMN model
Rank
P-value
Rank
C00164
Acetoacetate
1
0.00133727
89
C01342
Ammonium
2
0.00149079
C00122
Fumarate
3
0.0141678
C00092
D-Glucose 6-phosphate
4
C00365
dUMP
P-value
Recon 1 model
Rank
P-value
0.0521501
472
0.351164
1899
0.892111
504
0.376983
209
0.110517
138
0.10333
0.014582
190
0.101862
682
0.527185
5
0.0159453
143
0.0767043
198
0.136195
heptadecenoyl CoA (C17:1CoA, n-8)
6
0.0259099
NA
NA
NA
NA
C00006
Nicotinamide adenine dinucleotide phosphate
7
0.0290297
1263
0.599895
1079
0.867003
C00005
Nicotinamide adenine dinucleotide phosphate
7
0.0290297
498
0.256139
1100
0.882603
dTMP
8
0.0359442
1732
0.820842
36
0.0295632
eicosadienoyl-CoA (C20:2CoA, n-6)
9
0.0373189
NA
NA
NA
NA
- reduced
C00364
C00027
Hydrogen peroxide
10
0.0392387
1989
0.937971
406
0.292518
C00007
O2
10
0.0392387
192
0.102931
638
0.494795
C00704
Superoxide
10
0.0392387
421
0.219928
566
0.425701
Table A.63: The top 10 reporter metabolites of the comparison ’day 0 vs. day 56’ using the adipocyte
model in comparison to the EHMN and Recon 1 model.
EHMN model
KEGG ID
Metabolite name
Adipocyte model
Rank
P-value
Rank
P-value
Recon 1 model
Rank
P-value
G00159
(Gal)2 (GalNAc)1 (GlcA)2 (Xyl)1 (Ser)1
1
0.00119035
NA
NA
NA
NA
CN0004
IdoAbeta1-3GalNAcbeta1-4IdoAbeta1-
1
0.00119035
NA
NA
NA
NA
3GalNAcbeta1-4GlcAbeta1-3Galbeta13Galbeta1-4Xylbeta1-Ser-peptide
C01777
Acylcholine
2
0.00133707
NA
NA
NA
NA
C00060
Carboxylate
2
0.00133707
NA
NA
NA
NA
G00160
(Gal)2 (GalNAc)2 (GlcA)2 (Xyl)1 (Ser)1
3
0.00240525
NA
NA
826
0.635433
CN0005
Chondroitin sulfate C
3
0.00240525
NA
NA
NA
NA
CN0006
Chondroitin sulfate D
3
0.00240525
NA
NA
NA
NA
CN0007
Chondroitin sulfate E
3
0.00240525
NA
NA
NA
NA
C00426
Dermatan sulfate
3
0.00240525
NA
NA
NA
NA
CN0002
GalNAcbeta1-4IdoAbeta1-3GalNAcbeta1-
3
0.00240525
NA
NA
NA
NA
3
0.00240525
NA
NA
NA
NA
3
0.00240525
NA
NA
NA
NA
4GlcAbeta1-3Galbeta1-3Galbeta1-4Xylbeta1Ser-peptide
CN0003
GlcAbeta1-3GalNAcbeta1-4IdoAbeta13GalNAcbeta1-4GlcAbeta1-3Galbeta13Galbeta1-4Xylbeta1-Ser-peptide
CN0001
IdoAbeta1-3GalNAcbeta1-4GlcAbeta13Galbeta1-3Galbeta1-4Xylbeta1-Ser-peptide
C00164
Acetoacetate
4
0.00298485
179
0.562478
472
0.351164
C00008
ADP
5
0.00414578
25
0.0721646
725
0.559198
C00002
ATP
5
0.00414578
88
0.220569
493
0.364706
CE5869
lysyl-proline
6
0.00477995
NA
NA
NA
NA
CE5868
N-acetyl-seryl-aspartate
6
0.00477995
NA
NA
NA
NA
CE5867
N-acetyl-seryl-aspartyl-lysyl-proline
6
0.00477995
NA
NA
NA
NA
C01061
4-Fumarylacetoacetate
7
0.00491131
52
0.123833
152
0.110522
CE0852
palmitoleoyl-CoA
8
0.0073061
NA
NA
NA
NA
CE5787
kinetensin 1-3
9
0.00811106
NA
NA
NA
NA
C00257
D-Gluconic acid
10
0.00922411
NA
NA
NA
NA
Table A.64: The top 10 reporter metabolites of the comparison ’day 0 vs. day 56’ using the EHMN
model in comparison to the adipocyte and Recon 1 model.
Recon1 model
KEGG ID
C00164
Metabolite name
Adipocyte model
Rank
P-value
Rank
P-value
Acetoacetate
1
0.00034461
179
0.562478
EHMN model
Rank
P-value
89
0.0521501
NA
NA
NA
0.159726
126
0.0689563
R total 3 Coenzyme A
2
0.000626834
NA
C00510
Octadecenoyl-CoA (n-C18:1CoA)
3
0.000677739
71
C16218
trans-Octadec-2-enoyl-CoA
3
0.000677739
NA
NA
NA
NA
vaccenyl coenzyme A
3
0.000677739
NA
NA
NA
NA
C05272
Hexadecenoyl-CoA (n-C16:1CoA)
4
0.00142397
16
0.0515015
C00412
Stearoyl-CoA (n-C18:0CoA)
5
0.00144629
71
0.159726
alpha-Linolenoyl-CoA
6
0.0020225
NA
NA
208
0.11006
linoelaidyl coenzyme A
6
0.0020225
NA
NA
NA
NA
linoleic coenzyme A
6
0.0020225
NA
NA
NA
NA
tetracosapentaenoyl coenzyme A, n-3
6
0.0020225
NA
NA
NA
NA
C01342
Ammonium
7
0.0032767
185
0.576777
1899
C00154
Palmitoyl-CoA (n-C16:0CoA)
8
0.00335413
102
0.265511
151
0.0798932
arachidyl coenzyme A
9
0.00409846
NA
NA
NA
NA
cervonyl coenzyme A
9
0.00409846
NA
NA
NA
NA
docosa-4,7,10,13,16-pentaenoyl coenzyme A
9
0.00409846
NA
NA
NA
NA
heptadecanoyl coa
9
0.00409846
NA
NA
NA
NA
Hexacosanoyl-CoA (n-C26:0CoA)
9
0.00409846
NA
NA
NA
NA
lignocericyl coenzyme A
9
0.00409846
NA
NA
NA
NA
nervonyl coenzyme A
9
0.00409846
NA
NA
NA
NA
pentadecanoyl Coenzyme A
9
0.00409846
NA
NA
NA
NA
tetracosapentaenoyl coenzyme A, n-6
9
0.00409846
NA
NA
NA
NA
C16171
tetracosatetraenoyl coenzyme A
9
0.00409846
NA
NA
NA
NA
C01211
Procollagen 5-hydroxy-L-lysine
10
0.00797606
NA
NA
1897
C16740
Procollagen L-lysine
10
0.00797606
NA
NA
NA
C16173
1194
24
0.570979
0.0162869
0.892111
0.890457
NA
Table A.65: The top 10 reporter metabolites of the comparison ’day 0 vs. day 56’ using the Recon 1
model in comparison to the adipocyte and EHMN model.
A.6.3 Comparison of expression data
The following tables compare the top 10 reporter metabolites obtained from the different expression comparisons of this dataset, using the same model within each table.
Adipocyte
KEGG ID | Metabolite name | Rank (day 0 vs. day 14) | P-value (day 0 vs. day 14) | Rank (day 0 vs. day 56) | P-value (day 0 vs. day 56)
C00083 | Malonyl-CoA | 1 | 0.0031377 | 69 | 0.157038
C00041 | L-Alanine | 2 | 0.00446266 | 46 | 0.113193
C00097 | L-Cysteine | 2 | 0.00446266 | 99 | 0.25144
C00164 | Acetoacetate | 3 | 0.00471785 | 179 | 0.562478
- | eicosadienoyl-CoA (C20:2CoA, n-6) | 4 | 0.00673465 | 9 | 0.0373189
C02679 | dodecanoate (C12:0) | 5 | 0.00717003 | 32 | 0.0852161
- | Eicosanoate (n-C20:0) | 5 | 0.00717003 | 32 | 0.0852161
C00249 | hexadecanoate (n-C16:0) | 5 | 0.00717003 | 32 | 0.0852161
C01530 | octadecanoate (n-C18:0) | 5 | 0.00717003 | 53 | 0.124034
- | pentadecanoate (C15:0) | 5 | 0.00717003 | 32 | 0.0852161
C06424 | tetradecanoate (C14:0) | 5 | 0.00717003 | 32 | 0.0852161
- | docosenoyl-CoA (C22:1CoA, n-9) | 6 | 0.00736875 | 16 | 0.0515015
C05272 | hexadecenoyl-CoA (C16:1CoA, n-9) | 6 | 0.00736875 | 16 | 0.0515015
C00510 | octadecenoyl-CoA (C18:1CoA, n-7) | 6 | 0.00736875 | 71 | 0.159726
C03373 | 5-amino-1-(5-phospho-D-ribosyl)imidazole | 7 | 0.00862875 | 37 | 0.0937082
C00026 | 2-Oxoglutarate | 8 | 0.0120078 | 42 | 0.101344
- | sn-Glycerol 3-phosphate | 9 | 0.0135386 | 17 | 0.0534647
C00010 | Coenzyme A | 10 | 0.015976 | 65 | 0.152882
Table A.66: The comparison of the top 10 reporter metabolites between ’day 0 vs. day 14’ and ’day 0
vs. day 56’ based on the adipocyte model.
KEGG ID | Metabolite name | Rank (day 0 vs. day 56) | P-value (day 0 vs. day 56) | Rank (day 0 vs. day 14) | P-value (day 0 vs. day 14)
C00164 | Acetoacetate | 1 | 0.00133727 | 240 | 0.753101
C01342 | Ammonium | 2 | 0.00149079 | 210 | 0.595094
C00122 | Fumarate | 3 | 0.0141678 | 40 | 0.0995445
C00092 | D-Glucose 6-phosphate | 4 | 0.014582 | 99 | 0.285809
C00365 | dUMP | 5 | 0.0159453 | 175 | 0.498916
- | heptadecenoyl CoA (C17:1CoA, n-8) | 6 | 0.0259099 | 20 | 0.0372373
C00006 | Nicotinamide adenine dinucleotide phosphate | 7 | 0.0290297 | 194 | 0.543631
C00005 | Nicotinamide adenine dinucleotide phosphate - reduced | 7 | 0.0290297 | 194 | 0.543631
C00364 | dTMP | 8 | 0.0359442 | 233 | 0.707671
- | eicosadienoyl-CoA (C20:2CoA, n-6) | 9 | 0.0373189 | 4 | 0.00673465
C00027 | Hydrogen peroxide | 10 | 0.0392387 | 230 | 0.698221
C00007 | O2 | 10 | 0.0392387 | 107 | 0.30699
C00704 | Superoxide | 10 | 0.0392387 | 31 | 0.0671406
Table A.67: The comparison of the top 10 reporter metabolites between ’day 0 vs. day 56’ and ’day 0
vs. day 14’ based on the adipocyte model.
EHMN
KEGG ID | Metabolite name | Rank (day 0 vs. day 14) | P-value (day 0 vs. day 14) | Rank (day 0 vs. day 56) | P-value (day 0 vs. day 56)
C00024 | Acetyl-CoA | 1 | 0.00143499 | 19 | 0.0139786
CE5166 | 25(S)-trihydroxycoprostanoyl-CoA | 2 | 0.00148815 | 191 | 0.102176
C00093 | sn-Glycerol 3-phosphate | 3 | 0.00199595 | 369 | 0.192557
C03373 | Aminoimidazole ribotide | 4 | 0.00475059 | 168 | 0.0897013
C03125 | L-Cysteinyl-tRNA(Cys) | 5 | 0.00489944 | 174 | 0.0946223
C01639 | tRNA(Cys) | 5 | 0.00489944 | 174 | 0.0946223
C00026 | 2-Oxoglutarate | 6 | 0.00542615 | 535 | 0.274675
C02249 | Arachidonyl-CoA | 7 | 0.00641238 | 23 | 0.0161232
C00018 | Pyridoxal phosphate | 8 | 0.00698159 | 15 | 0.0109335
C00534 | Pyridoxamine | 8 | 0.00698159 | 15 | 0.0109335
C00647 | Pyridoxamine phosphate | 8 | 0.00698159 | 15 | 0.0109335
C00314 | Pyridoxine | 8 | 0.00698159 | 15 | 0.0109335
C00627 | Pyridoxine phosphate | 8 | 0.00698159 | 15 | 0.0109335
C03721 | Protein tyrosine-O-sulfate | 9 | 0.00761396 | 33 | 0.0188186
C00008 | ADP | 10 | 0.00790762 | 625 | 0.320019
Table A.68: The comparison of the top 10 reporter metabolites between ’day 0 vs. day 14’ and ’day 0
vs. day 56’ based on the EHMN model.
KEGG ID | Metabolite name | Rank (day 0 vs. day 56) | P-value (day 0 vs. day 56) | Rank (day 0 vs. day 14) | P-value (day 0 vs. day 14)
G00159 | (Gal)2 (GalNAc)1 (GlcA)2 (Xyl)1 (Ser)1 | 1 | 0.00119035 | 1045 | 0.513807
CN0004 | IdoAbeta1-3GalNAcbeta1-4IdoAbeta1-3GalNAcbeta1-4GlcAbeta1-3Galbeta1-3Galbeta1-4Xylbeta1-Ser-peptide | 1 | 0.00119035 | 1045 | 0.513807
C01777 | Acylcholine | 2 | 0.00133707 | 1987 | 0.947088
C00060 | Carboxylate | 2 | 0.00133707 | 1282 | 0.62891
G00160 | (Gal)2 (GalNAc)2 (GlcA)2 (Xyl)1 (Ser)1 | 3 | 0.00240525 | 1346 | 0.663502
CN0005 | Chondroitin sulfate C | 3 | 0.00240525 | 1346 | 0.663502
CN0006 | Chondroitin sulfate D | 3 | 0.00240525 | 1346 | 0.663502
CN0007 | Chondroitin sulfate E | 3 | 0.00240525 | 1346 | 0.663502
C00426 | Dermatan sulfate | 3 | 0.00240525 | 1346 | 0.663502
CN0002 | GalNAcbeta1-4IdoAbeta1-3GalNAcbeta1-4GlcAbeta1-3Galbeta1-3Galbeta1-4Xylbeta1-Ser-peptide | 3 | 0.00240525 | 1346 | 0.663502
CN0003 | GlcAbeta1-3GalNAcbeta1-4IdoAbeta1-3GalNAcbeta1-4GlcAbeta1-3Galbeta1-3Galbeta1-4Xylbeta1-Ser-peptide | 3 | 0.00240525 | 1819 | 0.879743
CN0001 | IdoAbeta1-3GalNAcbeta1-4GlcAbeta1-3Galbeta1-3Galbeta1-4Xylbeta1-Ser-peptide | 3 | 0.00240525 | 1572 | 0.759935
C00164 | Acetoacetate | 4 | 0.00298485 | 67 | 0.0335582
C00008 | ADP | 5 | 0.00414578 | 40 | 0.0224009
C00002 | ATP | 5 | 0.00414578 | 147 | 0.0694499
CE5869 | lysyl-proline | 6 | 0.00477995 | 405 | 0.18819
CE5868 | N-acetyl-seryl-aspartate | 6 | 0.00477995 | 405 | 0.18819
CE5867 | N-acetyl-seryl-aspartyl-lysyl-proline | 6 | 0.00477995 | 405 | 0.18819
C01061 | 4-Fumarylacetoacetate | 7 | 0.00491131 | 2019 | 0.959752
CE0852 | palmitoleoyl-CoA | 8 | 0.0073061 | 369 | 0.175351
CE5787 | kinetensin 1-3 | 9 | 0.00811106 | 1277 | 0.627053
C00257 | D-Gluconic acid | 10 | 0.00922411 | 353 | 0.165674
Table A.69: The comparison of the top 10 reporter metabolites between ’day 0 vs. day 56’ and ’day 0
vs. day 14’ based on the EHMN model.
Recon 1
KEGG ID | Metabolite name | Rank (day 0 vs. day 14) | P-value (day 0 vs. day 14) | Rank (day 0 vs. day 56) | P-value (day 0 vs. day 56)
C00097 | L-Cysteine | 1 | 0.000527156 | 425 | 0.307966
C00422 | triacylglycerol (homo sapiens) | 2 | 0.000885051 | 1192 | 0.959725
C00164 | Acetoacetate | 3 | 0.00135937 | 472 | 0.351164
C00001 | H2O | 4 | 0.00302208 | 458 | 0.333773
C19586 | 8,9 epxoy aflatoxin B1 | 5 | 0.00329827 | 339 | 0.234191
C06800 | aflatoxin B1 | 5 | 0.00329827 | 339 | 0.234191
C14497 | 6 beta hydroxy testosterone | 6 | 0.00346225 | 333 | 0.230306
C00249 | Hexadecanoate (n-C16:0) | 7 | 0.0046149 | 391 | 0.283537
C03373 | 5-amino-1-(5-phospho-D-ribosyl)imidazole | 8 | 0.00590893 | 108 | 0.0838115
C00083 | Malonyl-CoA | 9 | 0.00818013 | 367 | 0.264933
C01944 | Octanoyl-CoA (n-C8:0CoA) | 10 | 0.00924781 | 284 | 0.194032
Table A.70: The comparison of the top 10 reporter metabolites between ’day 0 vs. day 14’ and ’day 0
vs. day 56’ based on the Recon 1 model.
day 0 vs. day 56
KEGG ID
C00164
day 0 vs. day 14
Metabolite name
Rank
P-value
Rank
P-value
Acetoacetate
1
0.00034461
823
0.648248
R total 3 Coenzyme A
2
0.000626834
240
0.183677
C00510
Octadecenoyl-CoA (n-C18:1CoA)
3
0.000677739
213
0.163803
C16218
trans-Octadec-2-enoyl-CoA
3
0.000677739
280
0.210177
vaccenyl coenzyme A
3
0.000677739
280
0.210177
C05272
Hexadecenoyl-CoA (n-C16:1CoA)
4
0.00142397
1195
C00412
Stearoyl-CoA (n-C18:0CoA)
5
0.00144629
113
0.0831994
alpha-Linolenoyl-CoA
6
0.0020225
280
0.210177
linoelaidyl coenzyme A
6
0.0020225
280
0.210177
linoleic coenzyme A
6
0.0020225
280
0.210177
tetracosapentaenoyl coenzyme A, n-3
6
0.0020225
357
0.276244
C01342
Ammonium
7
0.0032767
958
0.757085
C00154
Palmitoyl-CoA (n-C16:0CoA)
8
0.00335413
125
0.0939063
arachidyl coenzyme A
9
0.00409846
826
0.651886
cervonyl coenzyme A
9
0.00409846
826
0.651886
docosa-4,7,10,13,16-pentaenoyl coenzyme A
9
0.00409846
826
0.651886
heptadecanoyl coa
9
0.00409846
280
0.210177
Hexacosanoyl-CoA (n-C26:0CoA)
9
0.00409846
357
0.276244
lignocericyl coenzyme A
9
0.00409846
357
0.276244
nervonyl coenzyme A
9
0.00409846
357
0.276244
pentadecanoyl Coenzyme A
9
0.00409846
280
0.210177
tetracosapentaenoyl coenzyme A, n-6
9
0.00409846
357
0.276244
C16171
tetracosatetraenoyl coenzyme A
9
0.00409846
357
0.276244
C01211
Procollagen 5-hydroxy-L-lysine
10
0.00797606
294
0.226234
C16740
Procollagen L-lysine
10
0.00797606
294
0.226234
C16173
0.96341
Table A.71: The comparison of the top 10 reporter metabolites between ’day 0 vs. day 56’ and ’day 0
vs. day 14’ based on the Recon 1 model.
A.7 Hypoxia-induced modulation of gene expression in human adipocytes (GSE34007)
Human adipocytes (Zen-bio cells) were incubated under hypoxic conditions (1% O2) for 24 h. Control human adipocytes were incubated under normoxic conditions (21% O2) [94].
Differentially expressed genes were calculated by comparing normoxic against hypoxic conditions.
A.7.1 Differentially expressed genes
The following table shows the top 10 differentially expressed genes, the pathways they are involved in, and those of the top 10 reporter metabolites that are also involved in these pathways.
Normoxic vs. hypoxic conditions
Rank
EntrezID
GeneName
P-value
Pathway
egl nine homolog 3
1.14428553605755e-09
Adipocyte
EHMN
Recon1
hsa05200: Pathways in
C00122
C00149
C00149
cancer
C00149
C01245
hsa05211: Renal cell
C00122
C00149
C00149
carcinoma
C00149
C00301
RefSeqID
1
112399
NM 022073
2
5138
NM 002599
(C. elegans)
phosphodiesterase
hsa00230: Purine
C00002
C00059
2A, cGMP-
1.42708417065616e-09
metabolism
C00008
C00301
stimulated
hsa05032: Morphine addiction
3
6927
HNF1 homeobox A
1.96590456276061e-09
metallothionein 3
3.565624773826e-09
chemokine (C-X-C
4.24736720895016e-09
NM 000545
4
4504
hsa04950: Maturity onset
diabetes of the young
NM 005954
5
7852
NM 001008540
motif) receptor 4
hsa04060: Cytokinecytokine receptor
interaction
hsa04062: Chemokine
signaling pathway
hsa04144: Endocytosis
C01245
C00002
C00008
hsa04360: Axon guidance
hsa04670: Leukocyte transendothelial migration
hsa04672: Intestinal
immune network for
IgA production
6
146439
7
768
coiled-coil domain
8.16911348924392e-09
containing 64B
NM 001216
8
54210
NM 018643
carbonic
1.03629888901659e-08
anhydrase IX
triggering receptor
1.59408005433769e-08
expressed on
myeloid cells 1
9
362
aquaporin 5
1.65740906047528e-08
NM 001651
10
5055
NM 001143818
hsa04970: Salivary
C01245
C01330
secretion
serpin peptidase
2.30141016000521e-08
hsa05146: Amoebiasis
C01245
inhibitor, clade B
(ovalbumin), member 2
Table A.72: The top 10 differentially expressed genes from the comparison ’normoxic vs. hypoxic
conditions’ with the corresponding pathways and reporter metabolites.
A.7.2 Comparison between the models
Each of the following tables lists the top 10 reporter metabolites of one model together with the ranks and P-values of the same metabolites in the other two models, based on the same expression data.
Normoxic vs. hypoxic conditions
Adipocyte model
KEGG ID
Metabolite name
EHMN model
Rank
P-value
Rank
P-value
0.0109453
Recon 1 model
Rank
P-value
C00606
3-Sulfino-L-alanine
1
0.00681743
26
33
0.0257915
C00445
5,10-Methenyltetrahydrofolate
2
0.00728798
1309
C00143
5,10-Methylenetetrahydrofolate
3
0.00990644
67
0.637251
107
0.0754446
0.0270489
198
C00149
L-Malate
4
0.0108091
96
0.139875
0.0375185
3
C00079
L-Phenylalanine
5
0.0147974
1091
0.522332
512
0.416332
C00078
L-Tryptophan
5
0.0147974
1363
0.658911
763
0.653734
C00082
L-Tyrosine
5
0.0147974
1363
0.658911
763
0.653734
C00049
L-Aspartate
6
0.0171947
607
0.262324
7
C00234
10-Formyltetrahydrofolate
7
0.019219
3
C00058
Formate
7
0.019219
1031
C01107
(R)-5-Phosphomevalonate
8
0.0236819
38
C00008
ADP
8
0.0236819
514
0.215428
277
0.198629
C00002
ATP
8
0.0236819
961
0.454433
200
0.142798
C00122
Fumarate
9
0.0290069
1501
0.718153
48
0.0371757
C00864
(R)-Pantothenate
10
0.0300113
64
0.0253678
37
0.0290031
0.00164668
0.491965
0.0172219
0.00385016
0.00724726
89
0.0615075
750
0.648936
21
0.0199861
Table A.73: The top 10 reporter metabolites of the comparison ’normoxic vs. hypoxic conditions’ using
the adipocyte model in comparison to the EHMN and Recon 1 model.
EHMN model
KEGG ID
Metabolite name
Adipocyte model
Rank
P-value
Rank
P-value
Recon 1 model
Rank
P-value
C00094
Sulfite
1
0.00129194
NA
NA
C00301
ADPribose
2
0.00152273
NA
NA
C00234
10-Formyltetrahydrofolate
3
0.00164668
7
C11555
1D-myo-Inositol 1,4,5,6-tetrakisphosphate
4
0.0026786
NA
NA
681
0.58955
C01245
D-myo-Inositol 1,4,5-trisphosphate
4
0.0026786
NA
NA
708
0.619569
C02249
Arachidonyl-CoA
5
0.00275275
NA
NA
404
0.308764
C14819
Fe3+
6
0.00276528
NA
NA
832
0.703891
C00059
Sulfate
7
0.00278386
NA
NA
826
0.700155
C02939
3-Methylbutanoyl-CoA
8
0.00313494
208
0.701614
68
C00025
L-Glutamate
9
0.00338359
215
0.722205
303
C00149
(S)-Malate
10
0.00378207
4
0.019219
0.0108091
2
0.00355222
5
0.00539267
89
0.0615075
0.0519052
0.214459
3
0.00385016
Table A.74: The top 10 reporter metabolites of the comparison ’normoxic vs. hypoxic conditions’ using
the EHMN model in comparison to the adipocyte and Recon 1 model.
| KEGG ID | Metabolite name | Recon 1 rank | Recon 1 P-value | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value |
|---|---|---|---|---|---|---|---|
| C00301 | ADPribose | 1 | 0.00200643 | NA | NA | 12 | 0.0042943 |
| C00094 | Sulfite | 2 | 0.00355222 | NA | NA | 1 | 0.00129194 |
| C00149 | L-Malate | 3 | 0.00385016 | 4 | 0.0108091 | 96 | 0.0375185 |
| C00500 | Biliverdin | 4 | 0.00462645 | NA | NA | 22 | 0.0101831 |
| C00153 | Nicotinamide | 5 | 0.00539267 | NA | NA | 655 | 0.28395 |
| C00003 | Nicotinamide adenine dinucleotide | 5 | 0.00539267 | 179 | 0.556851 | 167 | 0.0659428 |
| C00006 | Nicotinamide adenine dinucleotide phosphate | 5 | 0.00539267 | 26 | 0.0542526 | 562 | 0.239822 |
| C00399 | Ubiquinone-10 | 6 | 0.00604565 | NA | NA | 16 | 0.0073437 |
| C00049 | L-Aspartate | 7 | 0.00724726 | 6 | 0.0171947 | 607 | 0.262324 |
| C01330 | Sodium | 8 | 0.00877595 | 93 | 0.293479 | 2104 | 0.992782 |
| C00010 | Coenzyme A | 9 | 0.010932 | 34 | 0.0766474 | 1312 | 0.639169 |
| C00237 | Carbon monoxide | 10 | 0.0109933 | NA | NA | 21 | 0.00911377 |
| C00023 | Fe2+ | 10 | 0.0109933 | NA | NA | NA | NA |
Table A.75: The top 10 reporter metabolites of the comparison ’normoxic vs. hypoxic conditions’ using
the Recon 1 model in comparison to the adipocyte and EHMN model.
APPENDIX A. RESULTS OF ALL SELECTED DATASETS
A.8 Differential gene expression in adipose tissue from obese human subjects during weight loss and weight maintenance (GSE35411)
The subjects followed a low calorie diet (LCD) containing 1200 kcal/day for three months. The weight reduction phase was followed by a six-month follow-up period [95]:
- 3 paired samples per proband, taken at baseline, after weight reduction, and after the weight maintenance phase
The following comparisons were applied for the calculation of the differentially expressed genes:
(i) baseline vs. after weight reduction and (ii) baseline vs. after weight maintenance phase.
A.8.1 Differentially expressed genes
The following tables show the top 10 differentially expressed genes, the pathways they are involved in,
and those of the top 10 reporter metabolites that are also involved in these pathways.
Baseline vs. after weight reduction
| Rank | EntrezID | GeneName | RefSeqID | P-value | Pathway | Adipocyte | EHMN | Recon1 |
|---|---|---|---|---|---|---|---|---|
| 1 | 8365 | HIST1H4H - histone cluster 1, H4h | BC010926.1 | 2.38006191220476e-07 | hsa05034: Alcoholism | | | |
| | | | | | hsa05322: Systemic lupus erythematosus | | | |
| 2 | 55973 | BCAP29 - B-cell receptor-associated protein 29 | NM_001008406.1 | 6.2098504477583e-07 | | | | |
| 3 | 1622 | DBI - diazepam binding inhibitor (GABA receptor modulator, acyl-CoA binding protein) | CR456956.1, M15887.1, NM_001079862.1, NM_001079863.1, NM_020548.5 | 6.27199638434536e-07 | hsa03320: PPAR signaling pathway | | | |
| 4 | 55969 | C20orf24 - chromosome 20 open reading frame 24 | AF274936.1, BC001871.1, BC004446.1, NM_018840.2, NM_199483.1 | 6.67608370868795e-07 | | | | |
| 5 | 125 | ADH1B - alcohol dehydrogenase 1B (class I), beta polypeptide | NM_000668.3 | 1.113560790908e-06 | hsa00010: Glycolysis/Gluconeogenesis | C00111, C00236 | | |
| | | | | | hsa00071: Fatty acid metabolism | | | C00332 |
| | | | | | hsa00350: Tyrosine metabolism | C00122, C01036, C01179 | | C01036 |
| | | | | | hsa00830: Retinol metabolism | | | |
| | | | | | hsa00980: Metabolism of xenobiotics by cytochrome P450 | | | |
| | | | | | hsa00982: Drug metabolism - cytochrome P450 | | | |
| | | | | | hsa01100: Metabolic pathways | C00097, C00111, C00122, C00199, C00236, C00606, C01036, C01179, C03684 | C00003, C00004, C00100, C00577, C01024 | C00003, C00332, C00906, C01024, C01036, C01051, C03845, C05437, C05446, C05467 |
| 6 | 23086 | EXPH5 - exophilin 5 | AY099469.1, NM_015065.1 | 1.1786953108756e-06 | | | | |
| 7 | 55904 | MLL5 - myeloid/lymphoid or mixed-lineage leukemia 5 | AY147037.1, NM_018682.3, NM_182931.2 | 1.50324865193418e-06 | hsa00310: Lysine degradation | | | C00332 |
| 8 | 1806 | DPYD - dihydropyrimidine dehydrogenase | NM_000110.3 | 1.90562620638919e-06 | hsa00240: Pyrimidine metabolism | | | C00906 |
| | | | | | hsa00410: beta-Alanine metabolism | | C00100 | |
| | | | | | hsa00770: Pantothenate and CoA biosynthesis | C00097 | | |
| | | | | | hsa00983: Drug metabolism - other enzymes | | | |
| | | | | | hsa01100: Metabolic pathways | C00097, C00111, C00122, C00199, C00236, C00606, C01036, C01179, C03684 | C00003, C00004, C00100, C00577, C01024 | C00003, C00332, C00906, C01024, C01036, C01051, C03845, C05437, C05446, C05467 |
| 9 | 9669 | EIF5B - eukaryotic translation initiation factor 5B | NM_015904.3 | 1.94232219398723e-06 | hsa03013: RNA transport | | | |
| 10 | 1290 | COL5A2 - collagen, type V, alpha 2 | BC043613.1, BC086874.1, NM_000393.3 | 2.29350303664699e-06 | hsa04510: Focal adhesion | | | |
| | | | | | hsa04512: ECM-receptor interaction | | | |
| | | | | | hsa04974: Protein digestion and absorption | C00097 | | |
| | | | | | hsa05146: Amoebiasis | | | |
Table A.76: The top 10 differentially expressed genes from the comparison ’baseline vs. after weight
reduction’ with the corresponding pathways and reporter metabolites.
Baseline vs. after weight maintenance phase
| Rank | EntrezID | GeneName | RefSeqID | P-value | Pathway | Adipocyte | EHMN | Recon1 |
|---|---|---|---|---|---|---|---|---|
| 1 | 221895 | JAZF1 - JAZF zinc finger 1 | NM_175061.3 | 6.73014226330345e-07 | | | | |
| 2 | 128989 | C22orf25 - chromosome 22 open reading frame 25 | BC041339.1, NM_152906.2 | 7.17308804478182e-07 | | | | |
| 3 | 2720 | GLB1 - galactosidase, beta 1 | M27508.1, M34423.1, NM_000404.2, NM_001079811.1 | 1.18268373244184e-06 | hsa00052: Galactose metabolism | | C00031, C00124, C00267 | C00031 |
| | | | | | hsa00511: Other glycan degradation | | | |
| | | | | | hsa00531: Glycosaminoglycan degradation | | | |
| | | | | | hsa00600: Sphingolipid metabolism | | C00195, C01190, C01290 | C00195 |
| | | | | | hsa00604: Glycosphingolipid biosynthesis - ganglio series | | | |
| | | | | | hsa01100: Metabolic pathways | C00010, C00024, C00083, C00100, C00197, C05272 | C00001, C00005, C00006, C00024, C00031, C00064, C00124, C00195, C00267, C01190, C01290 | C00001, C00005, C00006, C00031, C00080, C00195, G00019, G00163, G00164 |
| | | | | | hsa04142: Lysosome | | | |
| 4 | 9659 | PDE4DIP - phosphodiesterase 4D interacting protein | AL832024.2, NM_001002811.1, NM_001002812.1 | 1.55563331646695e-06 | | | | |
| 5 | 116441 | TM4SF18 - transmembrane 4 L six family member 18 | BC014339.1, NM_138786.1 | 2.08304076554401e-06 | | | | |
| 6 | 2752 | GLUL - glutamate-ammonia ligase | BC051726.1, NM_001033044.1, NM_001033056.1, NM_002065.4 | 3.67979597866496e-06 | hsa00250: Alanine, aspartate and glutamate metabolism | | C00064 | |
| | | | | | hsa00330: Arginine and proline metabolism | | C00064 | |
| | | | | | hsa00630: Glyoxylate and dicarboxylate metabolism | C00024, C00100, C00197 | C00024, C00064 | |
| | | | | | hsa01100: Metabolic pathways | C00010, C00024, C00083, C00100, C00197, C05272 | C00001, C00005, C00006, C00024, C00031, C00064, C00124, C00195, C00267, C01190, C01290 | C00001, C00005, C00006, C00031, C00080, C00195, G00019, G00163, G00164 |
| | | | | | hsa04724: Glutamatergic synapse | | C00064 | |
| | | | | | hsa04727: GABAergic synapse | | C00064 | |
| 7 | 57124 | CD248 - CD248 molecule, endosialin | NM_020404.2 | 3.83457619319249e-06 | | | | |
| 8 | 54849 | FLJ20186 - differentially expressed in FDCP 8 homolog | BC015482.2, BC105592.1, NM_017702.2, NM_207514.1 | 4.75472638872404e-06 | | | | |
| 9 | 54502 | FLJ20273 - RNA binding motif protein 47 | NM_019027.1 | 5.01882280669899e-06 | | | | |
| 10 | 5783 | PTPN13 - protein tyrosine phosphatase, non-receptor type 13 (APO-1/CD95 (Fas)-associated phosphatase) | D21209.1, D21210.1, D21211.1, NM_006264.1, NM_080683.1, NM_080684.1, NM_080685.1, U12128.1 | 6.15726497888553e-06 | | | | |
Table A.77: The top 10 differentially expressed genes from the comparison ’baseline vs. after weight
maintenance phase’ with the corresponding pathways and reporter metabolites.
A.8.2 Comparison between the models
The following tables show the top 10 reporter metabolites of one model in comparison to the rank of
these metabolites using the other two models and the same expression data.
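As a rough illustration of how such a comparison table can be assembled, the following Python sketch ranks metabolites by P-value and looks up the top entries of one model in the other models, reporting None (shown as NA in the tables) where a metabolite is absent. The function names, the toy P-values and the placeholder IDs M1 to M4 are hypothetical and not taken from the thesis. Giving tied P-values the same rank is an assumption based on the tied ranks visible in the tables.

```python
from typing import Dict, List, Tuple


def dense_ranks(p_values: Dict[str, float]) -> Dict[str, int]:
    """Rank metabolites by ascending P-value; equal P-values share one rank."""
    ranks: Dict[str, int] = {}
    rank = 0
    previous = None
    for metabolite, p in sorted(p_values.items(), key=lambda kv: kv[1]):
        if p != previous:
            rank += 1
            previous = p
        ranks[metabolite] = rank
    return ranks


def compare_top(reference: Dict[str, float],
                others: List[Dict[str, float]],
                n: int = 10) -> List[Tuple]:
    """List the top-n metabolites of `reference` with rank and P-value in each other model."""
    ref_ranks = dense_ranks(reference)
    other_ranks = [dense_ranks(o) for o in others]
    rows = []
    for metabolite, rank in sorted(ref_ranks.items(), key=lambda kv: kv[1]):
        if rank > n:
            break
        row = [metabolite, rank, reference[metabolite]]
        for model, ranks in zip(others, other_ranks):
            if metabolite in model:
                row += [ranks[metabolite], model[metabolite]]
            else:
                row += [None, None]  # metabolite not part of this model
        rows.append(tuple(row))
    return rows


# Hypothetical toy input: metabolite ID -> P-value for two models.
model_a = {"M1": 0.01, "M2": 0.02, "M3": 0.02, "M4": 0.5}
model_b = {"M1": 0.3, "M3": 0.001}

for row in compare_top(model_a, [model_b], n=2):
    print(row)
```

Because tied metabolites share a rank, a "top 10" table built this way can contain more than ten rows, as several of the tables in this appendix do.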
Baseline vs. after weight reduction
| KEGG ID | Metabolite name | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| C01036 | 4-Maleylacetoacetate | 1 | 0.005185 | 19 | 0.00297981 | 4 | 0.0023486 |
| C00536 | Inorganic triphosphate | 2 | 0.00926387 | 29 | 0.00607262 | 7 | 0.00493835 |
| C00111 | Dihydroxyacetone phosphate | 3 | 0.0115888 | 91 | 0.0249449 | 51 | 0.0317427 |
| C00606 | 3-Sulfino-L-alanine | 4 | 0.0138944 | 52 | 0.0125323 | 12 | 0.0089768 |
| C00097 | L-Cysteine | 5 | 0.0164945 | 109 | 0.0340762 | 102 | 0.0643997 |
| C01179 | 3-(4-Hydroxyphenyl)pyruvate | 6 | 0.0220697 | 1996 | 0.949555 | 640 | 0.539659 |
| C00122 | Fumarate | 7 | 0.0227082 | 476 | 0.214497 | 87 | 0.053884 |
| C00236 | 3-Phospho-D-glyceroyl phosphate | 8 | 0.0243508 | 404 | 0.175407 | 323 | 0.242925 |
| C00199 | D-Ribulose 5-phosphate | 9 | 0.0249655 | 64 | 0.0161579 | 19 | 0.0141511 |
| C03684 | 6-Pyruvoyl-5,6,7,8-tetrahydropterin | 10 | 0.0267392 | 29 | 0.00607262 | 23 | 0.0159692 |
Table A.78: The top 10 reporter metabolites of the comparison ’baseline vs. after weight reduction’
using the adipocyte model in comparison to the EHMN and Recon 1 model.
| KEGG ID | Metabolite name | EHMN rank | EHMN P-value | Adipocyte rank | Adipocyte P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| C00412 | Stearoyl-CoA | 1 | 0.000119047 | 120 | 0.429406 | 349 | 0.275854 |
| CE0852 | palmitoleoyl-CoA | 2 | 0.000193774 | NA | NA | NA | NA |
| C00003 | NAD+ | 3 | 0.000216192 | 75 | 0.240684 | 9 | 0.0051958 |
| C02249 | Arachidonyl-CoA | 4 | 0.000271037 | NA | NA | 492 | 0.394575 |
| C00004 | NADH | 5 | 0.000309983 | 75 | 0.240684 | 15 | 0.0115902 |
| C01024 | Hydroxymethylbilane | 6 | 0.000391225 | NA | NA | 1 | 0.000272881 |
| C00577 | D-Glyceraldehyde | 7 | 0.000495807 | NA | NA | 85 | 0.0528238 |
| C02050 | Linoleoyl-CoA | 8 | 0.000546936 | NA | NA | NA | NA |
| CE2254 | docosanoyl-CoA | 9 | 0.000760086 | NA | NA | NA | NA |
| C00100 | Propanoyl-CoA | 9 | 0.000760086 | 240 | 0.796681 | 96 | 0.0615026 |
| CE0713 | 3-oxolinoleoyl-CoA | 10 | 0.000765249 | NA | NA | NA | NA |
Table A.79: The top 10 reporter metabolites of the comparison ’baseline vs. after weight reduction’
using the EHMN model in comparison to the adipocyte and Recon 1 model.
| KEGG ID | Metabolite name | Recon 1 rank | Recon 1 P-value | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value |
|---|---|---|---|---|---|---|---|
| C01024 | Hydroxymethylbilane | 1 | 0.000272881 | NA | NA | 6 | 0.000391225 |
| C05437 | zymosterol | 2 | 0.000387591 | 42 | 0.164672 | 190 | 0.0680593 |
| C00906 | 5,6-Dihydrothymine | 3 | 0.00163198 | NA | NA | 85 | 0.0219635 |
| C01036 | 4-Maleylacetoacetate | 4 | 0.0023486 | 1 | 0.005185 | 19 | 0.00297981 |
| C05446 | 3alpha,7alpha,12alpha,26-Tetrahydroxy-5beta-cholestane | 5 | 0.00301134 | NA | NA | 734 | 0.360634 |
| C05467 | 3alpha,7alpha,12alpha-Trihydroxy-5beta-24-oxocholestanoyl-CoA | 5 | 0.00301134 | NA | NA | 324 | 0.141624 |
| C01051 | Uroporphyrinogen III | 6 | 0.00485445 | NA | NA | 27 | 0.00591737 |
| C00536 | Inorganic triphosphate | 7 | 0.00493835 | 2 | 0.00926387 | 29 | 0.00607262 |
| C03845 | Zymostenol | 8 | 0.00514724 | 29 | 0.118452 | 460 | 0.205838 |
| C00003 | Nicotinamide adenine dinucleotide | 9 | 0.0051958 | 75 | 0.240684 | 3 | 0.000216192 |
| C00332 | Acetoacetyl-CoA | 10 | 0.00570785 | 125 | 0.439106 | 20 | 0.00349762 |
Table A.80: The top 10 reporter metabolites of the comparison ’baseline vs. after weight reduction’
using the Recon 1 model in comparison to the adipocyte and EHMN model.
Baseline vs. after weight maintenance phase
| KEGG ID | Metabolite name | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| C00024 | Acetyl-CoA | 1 | 0.000563428 | 10 | 0.000879614 | 287 | 0.230087 |
| C00010 | Coenzyme A | 2 | 0.00192595 | 270 | 0.105376 | 574 | 0.479357 |
| C00083 | Malonyl-CoA | 3 | 0.00537761 | 141 | 0.0546307 | 1013 | 0.844983 |
| | heptadecenoyl CoA (C17:1CoA, n-8) | 4 | 0.00968814 | NA | NA | NA | NA |
| | 1-Acyl-sn-glycerol 3-phosphate, adipocyte | 5 | 0.00994868 | NA | NA | NA | NA |
| C01342 | Ammonium | 6 | 0.011238 | 1997 | 0.953823 | 641 | 0.546636 |
| C00100 | Propanoyl-CoA (C3:0CoA) | 7 | 0.0129976 | 16 | 0.00261942 | 789 | 0.675516 |
| C00197 | 3-Phospho-D-glycerate | 8 | 0.0152006 | 109 | 0.0393705 | 408 | 0.32977 |
| | eicosadienoyl-CoA (C20:2CoA, n-6) | 9 | 0.0163288 | NA | NA | NA | NA |
| | docosenoyl-CoA (C22:1CoA, n-9) | 10 | 0.0194561 | NA | NA | NA | NA |
| C05272 | hexadecenoyl-CoA (C16:1CoA, n-9) | 10 | 0.0194561 | 1748 | 0.833331 | 317 | 0.250358 |
| C00510 | octadecenoyl-CoA (C18:1CoA, n-7) | 10 | 0.0194561 | 820 | 0.389293 | 116 | 0.100258 |
Table A.81: The top 10 reporter metabolites of the comparison ’baseline vs. after weight maintenance
phase’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
| KEGG ID | Metabolite name | EHMN rank | EHMN P-value | Adipocyte rank | Adipocyte P-value | Recon 1 rank | Recon 1 P-value |
|---|---|---|---|---|---|---|---|
| C00001 | H2O | 1 | 0.000000289791 | 164 | 0.548295 | 800 | 0.686042 |
| C00124 | D-Galactose | 2 | 0.0000363024 | NA | NA | 11 | 0.0142181 |
| C00267 | alpha-D-Glucose | 3 | 0.0000936421 | NA | NA | NA | NA |
| C00031 | D-Glucose | 3 | 0.0000936421 | 87 | 0.244664 | 1086 | 0.890196 |
| C00006 | NADP+ | 4 | 0.000202882 | 63 | 0.173565 | 3 | 0.00380984 |
| C00064 | L-Glutamine | 5 | 0.000363629 | 270 | 0.926882 | 1018 | 0.85136 |
| C01290 | beta-D-Galactosyl-1,4-beta-D-glucosylceramide | 6 | 0.000459548 | NA | NA | 933 | 0.789977 |
| C01582 | Galactose | 6 | 0.000459548 | NA | NA | NA | NA |
| C01190 | Glucosylceramide | 7 | 0.000463858 | NA | NA | 827 | 0.707438 |
| C00195 | N-Acylsphingosine | 8 | 0.00048447 | NA | NA | 827 | 0.707438 |
| C00005 | NADPH | 9 | 0.000500391 | 63 | 0.173565 | 5 | 0.00583327 |
| C00024 | Acetyl-CoA | 10 | 0.000879614 | 165 | 0.564465 | 287 | 0.230087 |
Table A.82: The top 10 reporter metabolites of the comparison ’baseline vs. after weight maintenance
phase’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
| KEGG ID | Metabolite name | Recon 1 rank | Recon 1 P-value | Adipocyte rank | Adipocyte P-value | EHMN rank | EHMN P-value |
|---|---|---|---|---|---|---|---|
| C00001 | H2O | 1 | 0.000273347 | 164 | 0.548295 | 211 | 0.0771878 |
| C00080 | H+ | 2 | 0.000603841 | 206 | 0.678839 | 1717 | 0.819442 |
| C00006 | Nicotinamide adenine dinucleotide phosphate | 3 | 0.00380984 | 63 | 0.173565 | 4 | 0.000202882 |
| C00195 | ceramide (homo sapiens) | 4 | 0.00432081 | NA | NA | 996 | 0.488231 |
| C00005 | Nicotinamide adenine dinucleotide phosphate - reduced | 5 | 0.00583327 | 63 | 0.173565 | 1344 | 0.656163 |
| C00031 | D-Glucose | 6 | 0.00871297 | 87 | 0.244664 | 1527 | 0.737933 |
| G00163 | heparan sulfate, precursor 2 | 7 | 0.00982322 | NA | NA | 33 | 0.0090727 |
| G00164 | heparan sulfate, precursor 3 | 7 | 0.00982322 | NA | NA | 33 | 0.0090727 |
| G00165 | heparan sulfate, precursor 4 | 7 | 0.00982322 | NA | NA | NA | NA |
| | heparan sulfate, precursor 5 | 7 | 0.00982322 | NA | NA | NA | NA |
| | heparan sulfate, precursor 6 | 7 | 0.00982322 | NA | NA | NA | NA |
| | heparan sulfate, precursor 7 | 7 | 0.00982322 | NA | NA | NA | NA |
| | heparan sulfate, precursor 8 | 7 | 0.00982322 | NA | NA | NA | NA |
| | heparan sulfate, precursor 9 | 8 | 0.0106507 | NA | NA | NA | NA |
| | de-Fuc form of PA6 (w/o peptide linkage) | 9 | 0.0115137 | NA | NA | NA | NA |
| | keratan sulfate I, degradation product 2 | 9 | 0.0115137 | NA | NA | NA | NA |
| G00019 | N-Acetyl-beta-D-glucosaminyl-1,2-alpha-D-mannosyl-1,3-(N-acetyl-beta-D-glucosaminyl-1,2-alpha-D-mannosyl-1,6)-(N-acetyl-beta-D-glucosaminyl-1,4)-beta-D-mannosyl-1,4-N-acetyl-beta-D-glucosaminyl-R | 9 | 0.0115137 | NA | NA | 1795 | 0.853731 |
| | n2m2nmasn (w/o peptide linkage) | 9 | 0.0115137 | NA | NA | NA | NA |
| | protein-linked asparagine residue (N-glycosylation site) | 9 | 0.0115137 | NA | NA | NA | NA |
| C00237 | Carbon monoxide | 10 | 0.0123475 | NA | NA | 44 | 0.0114238 |
| C00023 | Fe2+ | 10 | 0.0123475 | NA | NA | NA | NA |
Table A.83: The top 10 reporter metabolites of the comparison ’baseline vs. after weight maintenance
phase’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
A.8.3 Comparison of expression data
The following tables compare the top 10 reporter metabolites obtained with the same model for the different comparisons within this dataset.
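One way to read these tables is to ask how many of the top 10 reporter metabolites of one comparison also rank highly in the other. The short Python sketch below does this for the adipocyte model, using the ranks listed in Table A.85 for the entries that carry a KEGG ID. The function name and the cutoffs are illustrative assumptions, not part of the original analysis.

```python
# Ranks in 'baseline vs. after weight reduction' of metabolites that are in the
# top 10 of 'baseline vs. after weight maintenance phase' (adipocyte model,
# Table A.85). Only entries with a KEGG ID in that table are included.
rank_in_reduction = {
    "C00024": 116,  # Acetyl-CoA
    "C00010": 169,  # Coenzyme A
    "C00083": 110,  # Malonyl-CoA
    "C01342": 244,  # Ammonium
    "C00100": 240,  # Propanoyl-CoA
    "C00197": 128,  # 3-Phospho-D-glycerate
    "C05272": 70,   # hexadecenoyl-CoA
    "C00510": 120,  # octadecenoyl-CoA
}


def shared_within(ranks, cutoff):
    """Return the metabolites whose rank in the other comparison is <= cutoff."""
    return sorted(m for m, r in ranks.items() if r <= cutoff)


print(shared_within(rank_in_reduction, 10))   # []
print(shared_within(rank_in_reduction, 100))  # ['C05272']
```

For these entries, none of the maintenance-phase top 10 appears in the top 10 of the weight reduction comparison.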
Adipocyte
| KEGG ID | Metabolite name | Rank (baseline vs. reduction) | P-value (baseline vs. reduction) | Rank (baseline vs. maintenance) | P-value (baseline vs. maintenance) |
|---|---|---|---|---|---|
| C01036 | 4-Maleylacetoacetate | 1 | 0.005185 | 189 | 0.620879 |
| C00536 | Inorganic triphosphate | 2 | 0.00926387 | 15 | 0.0329617 |
| C00111 | Dihydroxyacetone phosphate | 3 | 0.0115888 | 144 | 0.465898 |
| C00606 | 3-Sulfino-L-alanine | 4 | 0.0138944 | 234 | 0.752851 |
| C00097 | L-Cysteine | 5 | 0.0164945 | 116 | 0.364113 |
| C01179 | 3-(4-Hydroxyphenyl)pyruvate | 6 | 0.0220697 | 198 | 0.654706 |
| C00122 | Fumarate | 7 | 0.0227082 | 108 | 0.331964 |
| C00236 | 3-Phospho-D-glyceroyl phosphate | 8 | 0.0243508 | 18 | 0.0445467 |
| C00199 | D-Ribulose 5-phosphate | 9 | 0.0249655 | 209 | 0.682199 |
| C03684 | 6-Pyruvoyl-5,6,7,8-tetrahydropterin | 10 | 0.0267392 | 59 | 0.165882 |
Table A.84: The comparison of the top 10 reporter metabolites between ’baseline vs. after weight
reduction’ and ’baseline vs. after weight maintenance phase’ based on the adipocyte model.
| KEGG ID | Metabolite name | Rank (baseline vs. maintenance) | P-value (baseline vs. maintenance) | Rank (baseline vs. reduction) | P-value (baseline vs. reduction) |
|---|---|---|---|---|---|
| C00024 | Acetyl-CoA | 1 | 0.000563428 | 116 | 0.413092 |
| C00010 | Coenzyme A | 2 | 0.00192595 | 169 | 0.583951 |
| C00083 | Malonyl-CoA | 3 | 0.00537761 | 110 | 0.382474 |
| | heptadecenoyl CoA (C17:1CoA, n-8) | 4 | 0.00968814 | 27 | 0.113801 |
| | 1-Acyl-sn-glycerol 3-phosphate, adipocyte | 5 | 0.00994868 | 37 | 0.14905 |
| C01342 | Ammonium | 6 | 0.011238 | 244 | 0.805861 |
| C00100 | Propanoyl-CoA (C3:0CoA) | 7 | 0.0129976 | 240 | 0.796681 |
| C00197 | 3-Phospho-D-glycerate | 8 | 0.0152006 | 128 | 0.44572 |
| | eicosadienoyl-CoA (C20:2CoA, n-6) | 9 | 0.0163288 | 68 | 0.232662 |
| | docosenoyl-CoA (C22:1CoA, n-9) | 10 | 0.0194561 | 70 | 0.233646 |
| C05272 | hexadecenoyl-CoA (C16:1CoA, n-9) | 10 | 0.0194561 | 70 | 0.233646 |
| C00510 | octadecenoyl-CoA (C18:1CoA, n-7) | 10 | 0.0194561 | 120 | 0.429406 |
Table A.85: The comparison of the top 10 reporter metabolites between ’baseline vs. after weight
maintenance phase’ and ’baseline vs. after weight reduction’ based on the adipocyte model.
EHMN
| KEGG ID | Metabolite name | Rank (baseline vs. reduction) | P-value (baseline vs. reduction) | Rank (baseline vs. maintenance) | P-value (baseline vs. maintenance) |
|---|---|---|---|---|---|
| C00412 | Stearoyl-CoA | 1 | 0.000119047 | 733 | 0.339824 |
| CE0852 | palmitoleoyl-CoA | 2 | 0.000193774 | 1313 | 0.640054 |
| C00003 | NAD+ | 3 | 0.000216192 | 19 | 0.0030949 |
| C02249 | Arachidonyl-CoA | 4 | 0.000271037 | 1896 | 0.902041 |
| C00004 | NADH | 5 | 0.000309983 | 31 | 0.00819121 |
| C01024 | Hydroxymethylbilane | 6 | 0.000391225 | 484 | 0.201965 |
| C00577 | D-Glyceraldehyde | 7 | 0.000495807 | 472 | 0.194984 |
| C02050 | Linoleoyl-CoA | 8 | 0.000546936 | 1247 | 0.603414 |
| CE2254 | docosanoyl-CoA | 9 | 0.000760086 | 654 | 0.29664 |
| C00100 | Propanoyl-CoA | 9 | 0.000760086 | 16 | 0.00261942 |
| CE0713 | 3-oxolinoleoyl-CoA | 10 | 0.000765249 | 1773 | 0.8454 |
Table A.86: The comparison of the top 10 reporter metabolites between ’baseline vs. after weight
reduction’ and ’baseline vs. after weight maintenance phase’ based on the EHMN model.
| KEGG ID | Metabolite name | Rank (baseline vs. maintenance) | P-value (baseline vs. maintenance) | Rank (baseline vs. reduction) | P-value (baseline vs. reduction) |
|---|---|---|---|---|---|
| C00001 | H2O | 1 | 0.000000289791 | 1847 | 0.888155 |
| C00124 | D-Galactose | 2 | 0.0000363024 | 2087 | 0.990244 |
| C00267 | alpha-D-Glucose | 3 | 0.0000936421 | 2077 | 0.984357 |
| C00031 | D-Glucose | 3 | 0.0000936421 | 1949 | 0.931825 |
| C00006 | NADP+ | 4 | 0.000202882 | 34 | 0.00740209 |
| C00064 | L-Glutamine | 5 | 0.000363629 | 1948 | 0.931821 |
| C01290 | beta-D-Galactosyl-1,4-beta-D-glucosylceramide | 6 | 0.000459548 | 2067 | 0.980227 |
| C01582 | Galactose | 6 | 0.000459548 | 321 | 0.140361 |
| C01190 | Glucosylceramide | 7 | 0.000463858 | 211 | 0.0766204 |
| C00195 | N-Acylsphingosine | 8 | 0.00048447 | 957 | 0.479916 |
| C00005 | NADPH | 9 | 0.000500391 | 813 | 0.40689 |
| C00024 | Acetyl-CoA | 10 | 0.000879614 | 166 | 0.060106 |
Table A.87: The comparison of the top 10 reporter metabolites between ’baseline vs. after weight
maintenance phase’ and ’baseline vs. after weight reduction’ based on the EHMN model.
Recon 1
| KEGG ID | Metabolite name | Rank (baseline vs. reduction) | P-value (baseline vs. reduction) | Rank (baseline vs. maintenance) | P-value (baseline vs. maintenance) |
|---|---|---|---|---|---|
| C01024 | Hydroxymethylbilane | 1 | 0.000272881 | 261 | 0.216028 |
| C05437 | zymosterol | 2 | 0.000387591 | 792 | 0.677306 |
| C00906 | 5,6-Dihydrothymine | 3 | 0.00163198 | 130 | 0.108296 |
| C01036 | 4-Maleylacetoacetate | 4 | 0.0023486 | 719 | 0.617654 |
| C05446 | 3alpha,7alpha,12alpha,26-Tetrahydroxy-5beta-cholestane | 5 | 0.00301134 | 45 | 0.0364003 |
| C05467 | 3alpha,7alpha,12alpha-Trihydroxy-5beta-24-oxocholestanoyl-CoA | 5 | 0.00301134 | 45 | 0.0364003 |
| C01051 | Uroporphyrinogen III | 6 | 0.00485445 | 102 | 0.0817576 |
| C00536 | Inorganic triphosphate | 7 | 0.00493835 | 54 | 0.0405944 |
| C03845 | Zymostenol | 8 | 0.00514724 | 977 | 0.81583 |
| C00003 | Nicotinamide adenine dinucleotide | 9 | 0.0051958 | 313 | 0.247902 |
| C00332 | Acetoacetyl-CoA | 10 | 0.00570785 | 1143 | 0.932488 |
Table A.88: The comparison of the top 10 reporter metabolites between ’baseline vs. after weight
reduction’ and ’baseline vs. after weight maintenance phase’ based on the Recon 1 model.
| KEGG ID | Metabolite name | Rank (baseline vs. maintenance) | P-value (baseline vs. maintenance) | Rank (baseline vs. reduction) | P-value (baseline vs. reduction) |
|---|---|---|---|---|---|
| C00001 | H2O | 1 | 0.000273347 | 450 | 0.363208 |
| C00080 | H+ | 2 | 0.000603841 | 305 | 0.222867 |
| C00006 | Nicotinamide adenine dinucleotide phosphate | 3 | 0.00380984 | 195 | 0.149033 |
| C00195 | ceramide (homo sapiens) | 4 | 0.00432081 | 111 | 0.0722453 |
| C00005 | Nicotinamide adenine dinucleotide phosphate - reduced | 5 | 0.00583327 | 248 | 0.186584 |
| C00031 | D-Glucose | 6 | 0.00871297 | 815 | 0.682264 |
| G00163 | heparan sulfate, precursor 2 | 7 | 0.00982322 | 126 | 0.0881313 |
| G00164 | heparan sulfate, precursor 3 | 7 | 0.00982322 | 126 | 0.0881313 |
| G00165 | heparan sulfate, precursor 4 | 7 | 0.00982322 | 126 | 0.0881313 |
| | heparan sulfate, precursor 5 | 7 | 0.00982322 | 126 | 0.0881313 |
| | heparan sulfate, precursor 6 | 7 | 0.00982322 | 126 | 0.0881313 |
| | heparan sulfate, precursor 7 | 7 | 0.00982322 | 126 | 0.0881313 |
| | heparan sulfate, precursor 8 | 7 | 0.00982322 | 126 | 0.0881313 |
| | heparan sulfate, precursor 9 | 8 | 0.0106507 | 13 | 0.0102711 |
| | de-Fuc form of PA6 (w/o peptide linkage) | 9 | 0.0115137 | 116 | 0.0805447 |
| | keratan sulfate I, degradation product 2 | 9 | 0.0115137 | 116 | 0.0805447 |
| G00019 | N-Acetyl-beta-D-glucosaminyl-1,2-alpha-D-mannosyl-1,3-(N-acetyl-beta-D-glucosaminyl-1,2-alpha-D-mannosyl-1,6)-(N-acetyl-beta-D-glucosaminyl-1,4)-beta-D-mannosyl-1,4-N-acetyl-beta-D-glucosaminyl-R | 9 | 0.0115137 | 116 | 0.0805447 |
| | n2m2nmasn (w/o peptide linkage) | 9 | 0.0115137 | 116 | 0.0805447 |
| | protein-linked asparagine residue (N-glycosylation site) | 9 | 0.0115137 | 116 | 0.0805447 |
| C00237 | Carbon monoxide | 10 | 0.0123475 | 771 | 0.636823 |
| C00023 | Fe2+ | 10 | 0.0123475 | 771 | 0.636823 |
Table A.89: The comparison of the top 10 reporter metabolites between ’baseline vs. after weight
maintenance phase’ and ’baseline vs. after weight reduction’ based on the Recon 1 model.
APPENDIX - LIST OF TABLES
Appendix - List of Tables
A.1 The top 10 differentially expressed genes from the comparison ’before vs. after energy restriction’ with the corresponding pathways and reporter metabolites.
A.2 The top 10 differentially expressed genes from the comparison ’after energy restriction vs. after weight stabilization’ with the corresponding pathways and reporter metabolites.
A.3 The top 10 differentially expressed genes from the comparison ’before dietary intervention vs. after weight stabilization’ with the corresponding pathways and reporter metabolites.
A.4 The top 10 reporter metabolites of the comparison ’before vs. after energy restriction’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
A.5 The top 10 reporter metabolites of the comparison ’before vs. after energy restriction’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
A.6 The top 10 reporter metabolites of the comparison ’before vs. after energy restriction’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
A.7 The top 10 reporter metabolites of the comparison ’after energy restriction vs. after weight stabilization’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
A.8 The top 10 reporter metabolites of the comparison ’after energy restriction vs. after weight stabilization’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
A.9 The top 10 reporter metabolites of the comparison ’after energy restriction vs. after weight stabilization’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
A.10 The top 10 reporter metabolites of the comparison ’before dietary intervention vs. after weight stabilization’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
A.11 The top 10 reporter metabolites of the comparison ’before dietary intervention vs. after weight stabilization’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
A.12 The top 10 reporter metabolites of the comparison ’before dietary intervention vs. after weight stabilization’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
A.13 The comparison of the top 10 reporter metabolites between ’before dietary intervention vs. after weight stabilization (DI)’, ’before vs. after energy restriction (ER)’, and ’after energy restriction vs. after weight stabilization (WS)’ based on the adipocyte model.
A.14 The comparison of the top 10 reporter metabolites between ’before vs. after energy restriction (ER)’, ’before dietary intervention vs. after weight stabilization (DI)’, and ’after energy restriction vs. after weight stabilization (WS)’ based on the adipocyte model.
A.15 The comparison of the top 10 reporter metabolites between ’after energy restriction vs. after weight stabilization (WS)’, ’before dietary intervention vs. after weight stabilization (DI)’, and ’before vs. after energy restriction (ER)’ based on the adipocyte model.
A.16 The comparison of the top 10 reporter metabolites between ’before dietary intervention vs. after weight stabilization (DI)’, ’before vs. after energy restriction (ER)’, and ’after energy restriction vs. after weight stabilization (WS)’ based on the EHMN model.
A.17 The comparison of the top 10 reporter metabolites between ’before vs. after energy restriction (ER)’, ’before dietary intervention vs. after weight stabilization (DI)’, and ’after energy restriction vs. after weight stabilization (WS)’ based on the EHMN model.
A.18 The comparison of the top 10 reporter metabolites between ’after energy restriction vs. after weight stabilization (WS)’, ’before dietary intervention vs. after weight stabilization (DI)’, and ’before vs. after energy restriction (ER)’ based on the EHMN model.
A.19 The comparison of the top 10 reporter metabolites between ’before dietary intervention vs. after weight stabilization (DI)’, ’before vs. after energy restriction (ER)’, and ’after energy restriction vs. after weight stabilization (WS)’ based on the Recon 1 model.
A.20 The comparison of the top 10 reporter metabolites between ’before vs. after energy restriction (ER)’, ’before dietary intervention vs. after weight stabilization (DI)’, and ’after energy restriction vs. after weight stabilization (WS)’ based on the Recon 1 model.
A.21 The comparison of the top 10 reporter metabolites between ’after energy restriction vs. after weight stabilization (WS)’, ’before dietary intervention vs. after weight stabilization (DI)’, and ’before vs. after energy restriction (ER)’ based on the Recon 1 model.
A.22 The top 10 differentially expressed genes from the comparison ’insulin resistant vs. insulin sensitive omental tissue’ with the corresponding pathways and reporter metabolites.
A.23 The top 10 differentially expressed genes from the comparison ’insulin resistant vs. insulin sensitive subcutaneous tissue’ with the corresponding pathways and reporter metabolites.
A.24 The top 10 reporter metabolites of the comparison ’insulin resistant vs. insulin sensitive omental tissue’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
A.25 The top 10 reporter metabolites of the comparison ’insulin resistant vs. insulin sensitive omental tissue’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
A.26 The top 10 reporter metabolites of the comparison ’insulin resistant vs. insulin sensitive omental tissue’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
A.27 The top 10 reporter metabolites of the comparison ’insulin resistant vs. insulin sensitive subcutaneous tissue’ using the adipocyte model in comparison to the EHMN and Recon 1 model.
A.28 The top 10 reporter metabolites of the comparison ’insulin resistant vs. insulin sensitive subcutaneous tissue’ using the EHMN model in comparison to the adipocyte and Recon 1 model.
A.29 The top 10 reporter metabolites of the comparison ’insulin resistant vs. insulin sensitive subcutaneous tissue’ using the Recon 1 model in comparison to the adipocyte and EHMN model.
A.30 The comparison of the top 10 reporter metabolites between ’insulin resistant vs. insulin sensitive omental tissue’ and ’insulin resistant vs. insulin sensitive subcutaneous tissue’ based on the adipocyte model.
A.31 The comparison of the top 10 reporter metabolites between ’insulin resistant vs. insulin sensitive subcutaneous tissue’ and ’insulin resistant vs. insulin sensitive omental tissue’ based on the adipocyte model.
A.32 The comparison of the top 10 reporter metabolites between ’insulin resistant vs. insulin sensitive omental tissue’ and ’insulin resistant vs. insulin sensitive subcutaneous tissue’ based on the EHMN model.
A.33 The comparison of the top 10 reporter metabolites between ’insulin resistant vs. insulin sensitive subcutaneous tissue’ and ’insulin resistant vs. insulin sensitive omental tissue’ based on the EHMN model.
A.34 The comparison of the top 10 reporter metabolites between ’insulin resistant vs. insulin sensitive omental tissue’ and ’insulin resistant vs. insulin sensitive subcutaneous tissue’ based on the Recon 1 model.
A.35 The comparison of the top 10 reporter metabolites between ’insulin resistant vs. insulin sensitive subcutaneous tissue’ and ’insulin resistant vs. insulin sensitive omental tissue’ based on the Recon 1 model.
APPENDIX - LIST OF TABLES
A.36 The top 10 differentially expressed genes from the comparison ’active vs. non-active’
with the corresponding pathways and reporter metabolites. . . . . . . . . . . . . . . . .
84
A.37 The top 10 reporter metabolites of the comparison ’active vs. non-active’ using the
adipocyte model in comparison to the EHMN and Recon 1 model. . . . . . . . . . . . .
84
A.38 The top 10 reporter metabolites of the comparison ’active vs. non-active’ using the
EHMN model in comparison to the adipocyte and Recon 1 model. . . . . . . . . . . . .
85
A.39 The top 10 reporter metabolites of the comparison ’active vs. non-active’ using the
Recon 1 model in comparison to the adipocyte and EHMN model. . . . . . . . . . . . .
85
A.40 The top 10 differentially expressed genes from the comparison ’African Americans vs.
Hispanics’ with the corresponding pathways and reporter metabolites. . . . . . . . . . .
87
A.41 The top 10 reporter metabolites of the comparison ’African Americans vs. Hispanics’
using the adipocyte model in comparison to the EHMN and Recon 1 model. . . . . . . .
87
A.42 The top 10 reporter metabolites of the comparison ’African Americans vs. Hispanics’
using the EHMN model in comparison to the adipocyte and Recon 1 model. . . . . . . .
88
A.43 The top 10 reporter metabolites of the comparison ’African Americans vs. Hispanics’
using the Recon 1 model in comparison to the adipocyte and EHMN model. . . . . . . .
88
A.44 The top 10 differentially expressed genes from the comparison ’WM - before LCD vs.
after LCD’ with the corresponding pathways and reporter metabolites. . . . . . . . . . .
90
A.45 The top 10 differentially expressed genes from the comparison ’WR - before LCD vs.
after LCD’ with the corresponding pathways and reporter metabolites. . . . . . . . . . .
91
A.46 The top 10 reporter metabolites of the comparison ’WM - before LCD vs. after LCD’
using the adipocyte model in comparison to the EHMN and Recon 1 model. . . . . . . .
92
A.47 The top 10 reporter metabolites of the comparison ’WM - before LCD vs. after LCD’
using the EHMN model in comparison to the adipocyte and Recon 1 model. . . . . . . .
92
A.48 The top 10 reporter metabolites of the comparison ’WM - before LCD vs. after LCD’
using the Recon 1 model in comparison to the adipocyte and EHMN model. . . . . . . .
93
A.49 The top 10 reporter metabolites of the comparison ’WR - before LCD vs. after LCD’
using the adipocyte model in comparison to the EHMN and Recon 1 model. . . . . . . .
93
A.50 The top 10 reporter metabolites of the comparison ’WR - before LCD vs. after LCD’
using the EHMN model in comparison to the adipocyte and Recon 1 model. . . . . . . .
94
A.51 The top 10 reporter metabolites of the comparison ’WR - before LCD vs. after LCD’
using the Recon 1 model in comparison to the adipocyte and EHMN model. . . . . . . .
94
A.52 The comparison of the top 10 reporter metabolites between ’WM - before LCD vs. after
LCD’ and ’WR - before LCD vs. after LCD’ based on the adipocyte model. . . . . . . .
95
A.53 The comparison of the top 10 reporter metabolites between ’WR - before LCD vs. after
LCD’ and ’WM - before LCD vs. after LCD’ based on the adipocyte model. . . . . . . .
95
A.54 The comparison of the top 10 reporter metabolites between ’WM - before LCD vs. after
LCD’ and ’WR - before LCD vs. after LCD’ based on the EHMN model. . . . . . . . .
96
A.55 The comparison of the top 10 reporter metabolites between ’WR - before LCD vs. after
LCD’ and ’WM - before LCD vs. after LCD’ based on the EHMN model. . . . . . . . .
96
A.56 The comparison of the top 10 reporter metabolites between ’WM - before LCD vs. after
LCD’ and ’WR - before LCD vs. after LCD’ based on the Recon 1 model. . . . . . . . .
97
A.57 The comparison of the top 10 reporter metabolites between ’WR - before LCD vs. after
LCD’ and ’WM - before LCD vs. after LCD’ based on the Recon 1 model. . . . . . . . .
97
A.58 The top 10 differentially expressed genes from the comparison ’day 0 vs. day 14’ with
the corresponding pathways and reporter metabolites. . . . . . . . . . . . . . . . . . . .
99
A.59 The top 10 differentially expressed genes from the comparison ’day 0 vs. day 56’ with
the corresponding pathways and reporter metabolites. . . . . . . . . . . . . . . . . . . . 100
A.60 The top 10 reporter metabolites of the comparison ’day 0 vs. day 14’ using the adipocyte
model in comparison to the EHMN and Recon 1 model. . . . . . . . . . . . . . . . . . . 101
A.61 The top 10 reporter metabolites of the comparison ’day 0 vs. day 14’ using the EHMN
model in comparison to the adipocyte and Recon 1 model. . . . . . . . . . . . . . . . . . 101
A.62 The top 10 reporter metabolites of the comparison ’day 0 vs. day 14’ using the Recon 1
model in comparison to the adipocyte and EHMN model. . . . . . . . . . . . . . . . . . 101
A.63 The top 10 reporter metabolites of the comparison ’day 0 vs. day 56’ using the adipocyte
model in comparison to the EHMN and Recon 1 model. . . . . . . . . . . . . . . . . . . 102
A.64 The top 10 reporter metabolites of the comparison ’day 0 vs. day 56’ using the EHMN
model in comparison to the adipocyte and Recon 1 model. . . . . . . . . . . . . . . . . . 102
A.65 The top 10 reporter metabolites of the comparison ’day 0 vs. day 56’ using the Recon 1
model in comparison to the adipocyte and EHMN model. . . . . . . . . . . . . . . . . . 103
A.66 The comparison of the top 10 reporter metabolites between ’day 0 vs. day 14’ and ’day
0 vs. day 56’ based on the adipocyte model. . . . . . . . . . . . . . . . . . . . . . . . . . 103
A.67 The comparison of the top 10 reporter metabolites between ’day 0 vs. day 56’ and ’day
0 vs. day 14’ based on the adipocyte model. . . . . . . . . . . . . . . . . . . . . . . . . . 104
A.68 The comparison of the top 10 reporter metabolites between ’day 0 vs. day 14’ and ’day
0 vs. day 56’ based on the EHMN model. . . . . . . . . . . . . . . . . . . . . . . . . . . 104
A.69 The comparison of the top 10 reporter metabolites between ’day 0 vs. day 56’ and ’day
0 vs. day 14’ based on the EHMN model. . . . . . . . . . . . . . . . . . . . . . . . . . . 105
A.70 The comparison of the top 10 reporter metabolites between ’day 0 vs. day 14’ and ’day
0 vs. day 56’ based on the Recon 1 model. . . . . . . . . . . . . . . . . . . . . . . . . . . 105
A.71 The comparison of the top 10 reporter metabolites between ’day 0 vs. day 56’ and ’day
0 vs. day 14’ based on the Recon 1 model. . . . . . . . . . . . . . . . . . . . . . . . . . . 106
A.72 The top 10 differentially expressed genes from the comparison ’normoxic vs. hypoxic
conditions’ with the corresponding pathways and reporter metabolites. . . . . . . . . . . 107
A.73 The top 10 reporter metabolites of the comparison ’normoxic vs. hypoxic conditions’
using the adipocyte model in comparison to the EHMN and Recon 1 model. . . . . . . . 108
A.74 The top 10 reporter metabolites of the comparison ’normoxic vs. hypoxic conditions’
using the EHMN model in comparison to the adipocyte and Recon 1 model. . . . . . . . 108
A.75 The top 10 reporter metabolites of the comparison ’normoxic vs. hypoxic conditions’
using the Recon 1 model in comparison to the adipocyte and EHMN model. . . . . . . . 108
A.76 The top 10 differentially expressed genes from the comparison ’baseline vs. after weight
reduction’ with the corresponding pathways and reporter metabolites. . . . . . . . . . . 110
A.77 The top 10 differentially expressed genes from the comparison ’baseline vs. after weight
maintenance phase’ with the corresponding pathways and reporter metabolites. . . . . . 112
A.78 The top 10 reporter metabolites of the comparison ’baseline vs. after weight reduction’
using the adipocyte model in comparison to the EHMN and Recon 1 model. . . . . . . . 112
A.79 The top 10 reporter metabolites of the comparison ’baseline vs. after weight reduction’
using the EHMN model in comparison to the adipocyte and Recon 1 model. . . . . . . . 113
A.80 The top 10 reporter metabolites of the comparison ’baseline vs. after weight reduction’
using the Recon 1 model in comparison to the adipocyte and EHMN model. . . . . . . . 113
A.81 The top 10 reporter metabolites of the comparison ’baseline vs. after weight maintenance
phase’ using the adipocyte model in comparison to the EHMN and Recon 1 model. . . . 113
A.82 The top 10 reporter metabolites of the comparison ’baseline vs. after weight maintenance
phase’ using the EHMN model in comparison to the adipocyte and Recon 1 model. . . . 114
A.83 The top 10 reporter metabolites of the comparison ’baseline vs. after weight maintenance
phase’ using the Recon 1 model in comparison to the adipocyte and EHMN model. . . . 114
A.84 The comparison of the top 10 reporter metabolites between ’baseline vs. after weight
reduction’ and ’baseline vs. after weight maintenance phase’ based on the adipocyte
model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
A.85 The comparison of the top 10 reporter metabolites between ’baseline vs. after weight
maintenance phase’ and ’baseline vs. after weight reduction’ based on the adipocyte
model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
A.86 The comparison of the top 10 reporter metabolites between ’baseline vs. after weight
reduction’ and ’baseline vs. after weight maintenance phase’ based on the EHMN model. 115
A.87 The comparison of the top 10 reporter metabolites between ’baseline vs. after weight
maintenance phase’ and ’baseline vs. after weight reduction’ based on the EHMN model. 116
A.88 The comparison of the top 10 reporter metabolites between ’baseline vs. after weight
reduction’ and ’baseline vs. after weight maintenance phase’ based on the Recon 1 model. 116
A.89 The comparison of the top 10 reporter metabolites between ’baseline vs. after weight
maintenance phase’ and ’baseline vs. after weight reduction’ based on the Recon 1 model. 116
Acknowledgement
I would like to express my gratitude to several people, without whose support it would not have been
possible to write this diploma thesis.
First of all, I would like to thank Univ.-Prof. Dr. Zlatko Trajanoski, director of the Section of Bioinformatics of the Medical University Innsbruck, who gave me the opportunity to work on this interesting
topic.
Moreover, I want to express my gratitude to both Univ.-Prof. DI Dr. Zlatko Trajanoski and DI(FH)
Dr. Stephan Pabinger, Section of Bioinformatics of the Medical University Innsbruck, for their excellent
support and guidance during the last nine months.
I also want to thank Univ.-Prof. Dr. habil. Matthias Dehmer, head of the Institute for Bioinformatics
and Translational Research at UMIT, for acting as my supervisor on behalf of UMIT.
In addition, I want to thank all my friends for encouraging and motivating me.
Finally, I want to say a special thank-you to my parents, who made my studies at UMIT possible.
Statutory declaration
Eidesstattliche Erklärung
I hereby declare that this diploma thesis has been written only by the undersigned and without any
assistance from third parties. Furthermore, I confirm that no sources have been used in the preparation
of this thesis other than those indicated in the thesis itself.
Hiermit erkläre ich an Eides statt, die Arbeit selbstständig verfasst und keine anderen als die angegebenen Hilfsmittel verwendet zu haben.
......................................................
Signature/Unterschrift