Full Text - Briefings in Bioinformatics

BRIEFINGS IN BIOINFORMATICS. VOL 15. NO 2. 212–228
Advance Access published on 21 May 2013
doi:10.1093/bib/bbt028
Inference of dynamic networks using
time-course data
Yongsoo Kim*, Seungmin Han*, Seungjin Choi and Daehee Hwang
Submitted: 22nd January 2013; Received (in revised form): 12th March 2013
Abstract
Cells execute their functions through dynamic operations of biological networks. Dynamic networks delineate the
operation of biological networks in terms of temporal changes of abundances or activities of nodes (proteins and
RNAs), as well as formation of new edges and disappearance of existing edges over time. Global genomic and proteomic technologies can be used to decode dynamic networks. However, using these experimental methods, it is
still challenging to identify temporal transition of nodes and edges. Thus, several computational methods for estimating dynamic topological and functional characteristics of networks have been introduced. In this review, we summarize concepts and applications of these computational methods for inferring dynamic networks and further
summarize methods for estimating spatial transition of biological networks.
Keywords: dynamic network; spatiotemporal dynamics; network inference
INTRODUCTION
Dynamic operations of biological networks involve
(i) changes in abundances or activities of nodes (e.g.
protein, RNA and DNA) and (ii) formation of new
edges and disappearance of existing edges over time.
To measure the temporal changes of nodes and
edges, global assay techniques have been used.
Genomic and proteomic technologies, such as microarrays and
mass spectrometry-based methods, permit the measurement of temporal changes in abundances of nodes.
Furthermore, several methods, such as mass spectrometry-based tandem affinity purification (TAP
[1]) and chromatin immunoprecipitation-sequencing
(ChIP-seq) [2], can be used to measure temporal
transitions of protein–protein interactions (PPIs)
and protein–DNA interactions (PDIs).
Time-course genomic analysis (e.g. gene expression profiling) to experimentally measure temporal
transitions of nodes has been a common practice.
Biological networks describing the temporal transitions of nodes measured by the time-course genomic
analysis can be reconstructed using PPIs and PDIs
deposited in interactome databases, such as Human
Protein Reference Database (HPRD) [3], Biological
General Repository for interaction datasets (BioGRID) [4] and Biomolecular Interaction Network
Database (BIND) [5]. However, these interactomes
include false-positive and -negative interactions [6].
Thus, the biological networks reconstructed using
these interactomes are limited in their ability to describe temporal
transitions of interactions (i.e. formation of new
edges or disappearance of existing edges over time).
Time-course interactome analysis could be performed to experimentally measure the temporal
transitions of edges. However, such analyses provide
coverage of the interactomes that is too limited to decode temporal transitions of individual edges at each time
point [6, 7].
Corresponding author. Daehee Hwang. School of Interdisciplinary Bioscience and Bioengineering, POSTECH, Pohang, 790-784,
Republic of Korea. Tel.: 82-54-279-2393; Fax: 82-54-279-8409; E-mail: [email protected]
*These authors equally contributed to this work.
Yongsoo Kim is a graduate student in School of Interdisciplinary Bioscience and Bioengineering at POSTECH. He has been
developing machine-learning approaches for network analyses using various omics data.
Seungmin Han is a graduate student in School of Interdisciplinary Bioscience and Bioengineering at POSTECH. He has been
studying design principles of biological networks via mathematical modeling approaches.
Seungjin Choi is a professor in Department of Computer Science and Engineering and Division of IT Convergence Engineering at
POSTECH. He leads the research group on machine learning.
Daehee Hwang is an associate professor in School of Interdisciplinary Bioscience and Bioengineering and Department of Chemical
Engineering at POSTECH. He leads the research group on systems biology and medicine.
© The Author 2013. Published by Oxford University Press. For Permissions, please email: [email protected]
Because of these limitations of the experimental
methods and their high costs,
a number of computational methods for inferring
dynamic networks using time-course global data
have been developed (Table 1). These methods
rely on statistical dependence of measured data for
inferring active nodes and/or edges: they identify
nodes showing expression changes over time and
infer edges between pairs of the nodes with statistical
dependence in temporal expression changes.
Dynamic networks estimated by these methods can
be represented as: (i) a single network in which an
edge represents a regulatory relationship between a
pair of nodes across the whole time span and (ii) a set
of networks each of which represents activation of
nodes and/or edges in each time point or each segment of the time span, thereby delineating temporal
transitions of nodes and/or edges.
In this review, we summarize computational methods for inferring dynamic networks using time-course
global data. These methods can be categorized based
on characteristics of the resulting networks and the
strategies used in these methods: (i) whether the resulting networks delineate temporal transition of
nodes and/or edges inferred from time-course data
(columns in Figure 1), (ii) whether the edges in the
resulting networks are directed or undirected (rows in
the left of Figure 1) and (iii) whether the inference
methods integrate information other than time-course gene expression data (e.g. GO terms or
ChIP-chip; rows in the right of Figure 1). Five
types of methods have been often used to reconstruct
a dynamic network for time-course data: (i) Bayesian
network (BN) [8], (ii) relevance network (RN) [9],
(iii) Markov Random Field (MRF) [10], (iv) ordinary
differential equations (ODE) [11] and (v) logic-based
models [12]. BN, RN and MRF-based methods
identify the presence of the interactions between
the molecules in the network (i.e. network structure).
ODE [11] and logic-based models [12], in addition to
the network structure, further identify the parameters
(i.e. reaction parameters for ODEs and Boolean logic
gates for logic-based models) for the interactions, permitting simulation of network outputs under external
perturbations. However, they are limited in their ability to
infer a dynamic network with a large number of
nodes. The inference of moderately large dynamic
networks is more feasible by BN, RN and MRF-based methods than by ODE and logic-based
models. Here, we thus focus on the approaches to
infer moderately large dynamic networks from
time-course global data.
Table 1: Publicly available software

Name of software: Minet [82]
Description: R package that implements algorithms based on mutual information, including ARACNE [29] and MRNET [32].
Link: http://bioconductor.org/packages/2.3/bioc/html/minet.html

Name of software: GeneNet [31]
Description: R package that implements algorithms based on graphical Gaussian models (GGMs) that represent multivariate dependencies by means of partial correlation.
Link: http://www.strimmerlab.org/software/genenet/index.html

Name of software: TDARACNE [50]
Description: R package that implements time-delay ARACNE.
Link: http://www.bioconductor.org/packages/release/bioc/html/TDARACNE.html

Name of software: dpcnet [52]
Description: R package that implements directed partial correlation (DPC) method.
Link: http://code.google.com/p/dpcnet/

Name of software: BNFinder [21]
Description: Software for inferring both static and dynamic networks.
Link: http://bioputer.mimuw.edu.pl/software/bnf/

Name of software: globalmit [20]
Description: Matlab/C++ toolkit for learning globally optimal DBN structure in terms of mutual information scoring metric.
Link: http://code.google.com/p/globalmit/

Name of software: PNA [39]
Description: Web-based tool for PNA that captures principal activation patterns and their associated subnetworks from microarray data.
Link: http://sbm.postech.ac.kr/pna/index.php

Name of software: TESLA [46]
Description: Software for estimating time-varying networks from time-series data.
Link: http://cogito-b.ml.cmu.edu/tesla/index.html

Name of software: KELLER [10]
Description: Software for estimating time-varying networks from time-series data.
Link: http://cogito-b.ml.cmu.edu/keller/

Name of software: ARTIVA [45]
Description: R package for inferring time-varying DBN networks from time-series data.
Link: http://cran.r-project.org/web/packages/ARTIVA/index.html
Figure 1: Categorization of dynamic network inference methods. Dynamic network inference methods are categorized based on the strategies used to infer dynamic networks and characteristics of the inferred networks. The
methods are first categorized into three groups, according to whether the resulting networks delineate (i) temporal
associations, (ii) temporal transitions of nodes and (iii) temporal transitions of edges (columns). Second, the methods
are further categorized according to whether the edges in the resulting networks are directed or undirected
(rows in the left). Finally, the methods are categorized according to whether they integrate other complementary
data with time-course gene expression data (Integrative) or not (not integrative; rows in the right).
Many eukaryotic proteins, up to 35%, are localized in multiple compartments [13], leading to
interactions with distinct partners in different compartments. Unfolding dynamic networks further into
intracellular organelles (e.g. nucleus, cytosol or mitochondria) can provide additional insights into the
operation of biological networks. Several experimental approaches and computational methods for
inferring spatial transition of nodes and edges in biological networks have been developed. Thus, we also
summarize these methods for inferring spatiotemporal operations of biological networks. The
abbreviations for technical terminologies used in
this review are summarized in Table 2.
NETWORKS WITH TEMPORAL
ASSOCIATIONS
In the simplest dynamic network, an edge represents
a regulatory relationship between a pair of nodes
(e.g. genes and proteins) across the whole time
span (‘Network with Temporal Associations’ in
Figure 1). Several methods have been developed
for estimating the regulatory network structure
using time-course gene expression data. These methods evaluate statistical dependence between the
nodes throughout the whole time span using the
time-course gene expression data, resulting in a
single network with temporal regulatory relationships [14]. These methods include BN, dynamic
BN (DBN) and RN methods, which have been
applied to identify transcriptional regulatory networks (TRNs) and gene interaction networks
(GINs). In this review, we use GINs to denote
networks whose links are estimated from the
correlation in time-course gene expression
data and are neither transcription factor (TF)-target interactions in TRNs nor genetic relationships
in genetic regulatory networks.
Table 2: Table of abbreviations
Abbreviation   Full name
AD             Alzheimer’s disease
ARTIVA         Auto regressive time varying models
BN             Bayesian network
CC             Correlation coefficient
DAG            Directed acyclic graph
DBN            Dynamic Bayesian network
DEG            Differentially expressed gene
DPI            Data processing inequality
GIN            Gene interaction network
GO             Gene ontology
GOBP           Gene ontology biological process
GOCC           Gene ontology cellular compartment
GOMF           Gene ontology molecular function
KELLER         Kernel-re-weighted logistic regression
MI             Mutual information
MRF            Markov Random Field
nhDBN          Non-homogenous dynamic Bayesian network
nsDBN          Non-stationary dynamic Bayesian network
ONMF           Orthogonal non-negative matrix factorization
PDI            Protein–DNA interaction
PMI            Protein-metabolite interaction
PNA            Principal network analysis
PPI            Protein–protein interaction
PTM            Post-translational modification
RN             Relevance network
SANDY          Statistical analysis of network dynamics
TAP            Tandem affinity purification
TESLA          Temporally smoothed l1-regularized logistic regression
TF             Transcription factor
TFBS           Transcription factor-binding site
TRN            Transcriptional regulatory network
TV-DBN         Time-varying dynamic Bayesian network
Y2H            Yeast two-hybrid
BN methods have been developed to infer regulatory relationships between the genes using time-course gene expression data [8]. A BN is a probabilistic
graphical model that represents conditional
dependence among the nodes using directed acyclic
graphs (DAGs). BN methods infer static and directed
edges that represent active regulatory relationships
between the nodes across the whole time span
(Bayesian network in Figure 2). For example,
Friedman et al. [8] introduced a BN method to reconstruct a GIN for 800 genes with expression
changes using time-course gene expression data measured at different phases (M/G1, G1, S, G2 and M)
during cell cycle of Saccharomyces cerevisiae [15]. The
network revealed dynamic activation of diverse
modes of transcriptional regulation (e.g. transcriptional cascades or feedforward loops) during the
cell cycle. Although the BN methods can infer temporal regulatory associations, they do not take into
account the temporal order in the data [i.e. observed
gene expression values are assumed to be independent and identically distributed (iid)]. Therefore, temporal associations involving time-delays among the
genes are not likely to be detected by these methods.
Also, these methods use a DAG as the structure of
the BN, and thus cannot detect cyclic regulatory relationships among the nodes, such as feedback
loops.
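The DAG restriction can be made concrete with a small check. The following Python sketch is illustrative only: the function name `is_dag` and the three toy edge lists are hypothetical and are not taken from any of the cited tools. It uses Kahn's topological sort to test whether a candidate structure is acyclic. A transcriptional cascade and a feedforward loop pass the test, whereas a feedback loop does not, so it cannot be encoded as a BN structure.

```python
from collections import defaultdict, deque

def is_dag(nodes, edges):
    """Return True if the directed graph has no cycle (Kahn's algorithm)."""
    indegree = {n: 0 for n in nodes}
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
    # Start from nodes without parents and peel the graph layer by layer.
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    # Nodes on a cycle never reach indegree 0, so they are never visited.
    return visited == len(nodes)

nodes = ["x1", "x2", "x3"]
cascade = [("x1", "x2"), ("x2", "x3")]
feedforward = [("x1", "x2"), ("x2", "x3"), ("x1", "x3")]
feedback = [("x1", "x2"), ("x2", "x3"), ("x3", "x1")]

for name, edges in [("cascade", cascade), ("feedforward", feedforward), ("feedback", feedback)]:
    print(name, "is a valid BN structure:", is_dag(nodes, edges))
```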
Figure 2: Inferred example networks. The networks with temporal associations inferred from time-course gene
expression data (first row) are represented as BNs, DBNs and RNs. Using time-course gene expression data, BN
and RN methods infer directed and undirected relationships between the nodes (x_i), respectively. These edges are
estimated from statistical dependence between the nodes across the whole time span. DBN methods first infer a
prior BN and then a transition network showing directed relationships between the nodes (x_i^t and x_i^{t+1}) at t and
t + 1. The inferred network from DBN contains two feedback loops, x1→x1 and x1→x2→x3→x1. The dynamic network with temporal transition of nodes (second row) involves activation of a set of nodes (x_i^t) in the network (Gt)
at each t. The nodes activated at t are linked (thick lines) when they interact with each other based on known interactomes. The dynamic networks with temporal transition of edges (third row) involve activation of a set of nodes
and edges in the network at each t. Non-stationary DBNs (nsDBNs; Im) show transition of directed edges in m
time ranges. The activation of node 2 (x2) by node 1 (x1) disappeared, and the positive regulation of node 3 (x3) by
node 1 (x1) was newly formed at second time range (m = 2). MRF networks (GT) show transition of undirected
edges at each t.
DBN methods [16–18] were developed to use the
temporal order of the time-course data in inferring
regulatory relationships between the nodes. The
DBN method developed by Friedman et al. [16] constructs prior and transition networks (Dynamic
Bayesian network in Figure 2). The prior network
represents the statistical dependence among the
nodes at the initial time point, and the transition
network represents the dependence between the
nodes at t and t + 1. Unlike BNs, the transition network can include cyclic regulatory relationships
among the nodes (see feedback loops in the inferred
network structure of Figure 2). Kim et al. [19] applied
BN and DBN methods to the aforementioned
time-course gene expression data generated during
cell cycle of S. cerevisiae [15] and then compared the
resulting GINs. They found that the DBN method,
compared with the BN method, generated a GIN
with fewer false-positive interactions according to
KEGG cell cycle pathways. A major drawback of
both BN and DBN methods is that the search
space for candidate network structures increases exponentially as the number of the nodes in the
networks increases. Thus, BN and DBN methods
are used to estimate the networks with only a
small number of nodes. Recently, to resolve this
problem, efficient algorithms and tools, such as
GlobalMIT [20] and BNFinder [21], have been
developed [20–23].
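To illustrate how a transition network can contain a cycle even though each time slice is acyclic, the sketch below fits a first-order linear model x(t + 1) ≈ A x(t) by least squares and reports an edge j → i wherever |A[i, j]| exceeds a cutoff. This is a simplified stand-in under stated assumptions (linear-Gaussian dynamics, an arbitrary cutoff of 0.3, simulated data) and not the score-based structure search used by the cited DBN methods; the function name `transition_edges` is hypothetical.

```python
import numpy as np

def transition_edges(X, threshold=0.3):
    """X: (T, p) time-course matrix. Returns directed edges (j, i), meaning
    node j at time t influences node i at time t + 1."""
    past, future = X[:-1], X[1:]
    # Least-squares fit of future ≈ past @ B, so A = B.T maps x(t) to x(t + 1).
    B, *_ = np.linalg.lstsq(past, future, rcond=None)
    A = B.T
    p = X.shape[1]
    return {(j, i) for i in range(p) for j in range(p) if abs(A[i, j]) > threshold}

# Simulated feedback loop x0 -> x1 -> x2 -> x0 (assumed toy dynamics).
rng = np.random.default_rng(0)
A_true = np.array([[0.0, 0.0, 0.8],
                   [0.8, 0.0, 0.0],
                   [0.0, 0.8, 0.0]])
X = np.zeros((500, 3))
X[0] = rng.normal(size=3)
for t in range(499):
    X[t + 1] = A_true @ X[t] + rng.normal(scale=0.5, size=3)

edges = transition_edges(X)
print(sorted(edges))
```

Because every edge points from time t to time t + 1, the unrolled graph stays acyclic while the collapsed node-level graph recovers the loop.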
RN methods [9, 24, 25] have also been used to
reconstruct GINs [9, 24] and TRNs [25] using
time-course gene expression data. For the inference,
they use relevance measures for pairs of the nodes
(Relevance network in Figure 2), such as correlation
coefficients (CC) [9] and mutual information (MI)
[24, 25]. For example, Remondini et al. [9] generated
time-course gene expression data at five time points
(1, 2, 4, 8 and 16 h) from rat fibroblast cells conditionally expressing Myc-estrogen receptor oncoprotein. They estimated temporal associations among
genes based on their correlation of time-course
gene expression and then reconstructed a GIN
using the estimated associations. Using the network,
they identified transregulation cascades among
c-Myc and its 130 target genes. Unlike the BN
and DBN methods, RN methods can effectively reconstruct the networks with large numbers of nodes.
However, neither MI nor CC can discriminate indirect links from direct ones, thus leading to a large
number of false-positive interactions. To reduce the
false positives, several RN methods apply post-processing after constructing relevance networks
[26–28]. In the algorithm for the reconstruction of
accurate cellular networks (ARACNE; [29]), false-positive links are reduced by removing indirect links
using data processing inequality (DPI), a criterion
that can discriminate direct links from indirect
ones. In other RN methods, alternative association
measures, such as partial correlation coefficient
(ParCorA [30] and GeneNet [31]) and conditional
mutual information (MRNET [32]), have been
used to sort out direct associations from indirect
ones. In particular, a Gaussian graphical model
(GGM) is an undirected graphical model that estimates
the conditional dependence between nodes based on
partial correlation coefficients. Efficient algorithms
for the inference using GGM have been developed
[33–35].
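The difference between a plain correlation and a partial correlation can be seen on a three-node chain x → y → z. The sketch below is a minimal illustration under stated assumptions: the data are simulated, samples are treated as independent, and the partial correlations are obtained from the inverse of the sample covariance matrix, which is only feasible when there are more samples than nodes. It is not the shrinkage estimator of any cited package, and the function name `partial_correlation` is hypothetical.

```python
import numpy as np

def partial_correlation(X):
    """Partial correlation matrix from the precision (inverse covariance)
    matrix. X has shape (samples, variables)."""
    precision = np.linalg.inv(np.cov(X, rowvar=False))
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

# Simulated chain x -> y -> z: x and z are linked only through y.
rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
y = x + 0.5 * rng.normal(size=n)
z = y + 0.5 * rng.normal(size=n)
X = np.column_stack([x, y, z])

cor = np.corrcoef(X, rowvar=False)
pcor = partial_correlation(X)
print("correlation x-z:        ", round(cor[0, 2], 2))
print("partial correlation x-z:", round(pcor[0, 2], 2))
```

The x–z correlation is large, which a CC-based RN would report as an edge, whereas the partial correlation given y is close to zero, so the indirect link is discarded while the direct x–y and y–z links remain.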
DYNAMIC NETWORKS WITH
TEMPORAL TRANSITION OF
NODES
All the aforementioned methods estimate the networks with temporal regulatory relationships (first
row in Figure 2). However, these networks do
not show how nodes and edges become active
at individual time points and how their activities
evolve over time. Several approaches (Dynamic
networks with temporal transition of nodes in
Figure 2) have been introduced to reconstruct dynamic networks to overcome these drawbacks.
These approaches have been commonly applied to
identify dynamic networks, including PPIs and
PDIs. These methods first identify differentially expressed genes (DEGs) over time and then cluster
the DEGs based on their differential expression patterns or other criteria, such as functional similarity
of the genes. In each time point, a network is then
reconstructed for the genes in the clusters by linking the genes with known interactions (e.g. PPIs
and PDIs) from public databases, such as HPRD
[3], BioGRID [4] and BIND [5]. Fold-changes of
the genes are used to represent activation of the
nodes in the network at each time point (transitional networks in Figure 2). The edges represent
potential interactions among the nodes but not necessarily the statistical dependence between the
nodes. The networks at multiple time points can
provide temporal activation of the nodes and their
transition over time.
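The procedure described above can be sketched in a few lines of Python. The example is a toy illustration: the gene names, log2 fold-change values, interactome and the cutoff of 1.0 are all made up, the clustering step is omitted, and the function name `networks_by_timepoint` is hypothetical. At each time point, nodes whose absolute log2 fold-change reaches the cutoff are marked active, and known interactions are kept only when both partners are active.

```python
def networks_by_timepoint(log2_fc, interactome, cutoff=1.0):
    """log2_fc: {gene: [log2 fold-change at each time point]}.
    interactome: list of (gene_a, gene_b) known interactions.
    Returns one {"nodes", "edges"} dict per time point."""
    n_timepoints = len(next(iter(log2_fc.values())))
    networks = []
    for t in range(n_timepoints):
        # Nodes are active when their fold-change passes the cutoff at time t.
        active = {g for g, fc in log2_fc.items() if abs(fc[t]) >= cutoff}
        # Edges come from prior knowledge, not from statistical dependence.
        edges = [(a, b) for a, b in interactome if a in active and b in active]
        networks.append({"nodes": active, "edges": edges})
    return networks

log2_fc = {
    "geneA": [1.5, 1.2, 0.1],
    "geneB": [0.2, 1.8, 1.6],
    "geneC": [0.0, 0.3, 2.1],
}
interactome = [("geneA", "geneB"), ("geneB", "geneC")]

nets = networks_by_timepoint(log2_fc, interactome)
for t, net in enumerate(nets):
    print(t, sorted(net["nodes"]), net["edges"])
```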
Luscombe et al. [36] reconstructed dynamic TRNs
in S. cerevisiae using this approach. They identified
455 genes showing expression changes over time
using time-course gene expression data generated
during the cell cycle and then grouped them into
five phase-specific clusters (early and late G1, S, G2
and M phases). Dynamic TRNs were constructed for
70 TFs and their target genes in the five clusters. The
TRNs revealed temporal activation of the TFs and
their targets at the five cell cycle phases and also their
transition along the cell cycle. Also, using this approach, Hwang et al. [37] identified 923 genes showing expression changes along the progression of
prion diseases, categorized them into four groups
representing PrPSc accumulation, microglial/astrocytic activation, synaptic degeneration and neuronal
cell death, and then reconstructed dynamic networks
delineating temporal activation of the genes in individual groups during the progression of prion
diseases. Moreover, this approach was applied to
time-course phosphoproteomic profiles to generate
dynamic signaling networks. To monitor temporal
dynamics of phosphoproteome on growth factor
stimulation, Olsen et al. [38] detected 6600 phosphorylation sites on 2244 proteins and measured
the changes in their abundance at 1, 5, 10 and
20 min after treating HeLa cells with epidermal
growth factor (EGF). Then, they clustered 1046
phosphopeptides regulated by the EGF signaling
into seven functional protein classes based on the
temporal transition patterns in their abundance.
They then reconstructed dynamic signaling networks
that delineate transition of phosphopeptides belonging to the individual protein classes and thus dynamic
activation of cellular functions associated with the
protein classes upon the EGF stimulation.
In these studies, the procedures for identification
and clustering of DEGs and network modeling were
performed separately using different tools. To automate these procedures, Kim et al. [39] developed a
framework, called principal network analysis (PNA)
that can perform these procedures at once using
orthogonal non-negative matrix factorization
(ONMF). By applying the ONMF to time-course
gene expression data, PNA identifies differential expression patterns and then selects not only genes
showing the differential expression patterns but also
edges linking the genes that show similar expression patterns among all the known PPIs or PDIs.
Using these selected nodes and edges, PNA automatically generates a subnetwork describing both nodes
and edges showing each differential expression pattern. Furthermore, PNA combines subnetworks for
multiple differential expression patterns. These combined subnetworks provide new insights into temporal activation of the subnetworks. PNA was
applied to the time-course gene expression data collected during the progression of prion diseases. The
resulting networks revealed that the genes related to
PrPSc accumulation and microglial/astrocytic activation were activated early and densely connected, reflecting activation of immune responses in microglia
and astrocytes caused by PrPSc accumulation and
thus coordinated activation of these cellular processes
in the early stage of prion diseases.
DYNAMIC NETWORKS WITH
TEMPORAL TRANSITION OF
EDGES
The aforementioned methods identify only temporal
transition of the nodes from the time-course gene
expression data and link the nodes using the
known PPI and PDI interactomes. Thus, the resulting networks can include false-positive edges that are
not active under the conditions being investigated.
However, upon environmental stimuli, new sets of
genes and proteins are induced or repressed over
time. Proteins can also move from one compartment
to another upon their activation and change their interacting partners at the two different compartments.
These dynamic changes in localization of proteins
lead to formation of new edges and disappearance
of existing edges. Thus, inference of the transition
of the edges reflecting the above situations can provide new insights into dynamic operation of biological networks. Several methods to infer temporal
activation of edges and their transition over time
have been developed (dynamic network with temporal transition of edges in Figure 2). These methods
include variants of DBNs, such as non-stationary DBN
(nsDBN) and time-varying DBN (TV-DBN), and
Markov Random Field (MRF)-based methods,
which have been applied to identify GINs and
TRNs.
Variants of DBNs have been developed for inferring temporal transition of edges in dynamic GINs
from time-course gene expression data [40–43]. For
example, the non-stationary dynamic Bayesian network (nsDBN)
developed by Jia and Huan [41] decomposes the
whole time span into several segments of the time
span and then reconstructs a DBN within each time
segment, assuming that the network structure for the
time points within the time segment is the same
(nsDBN in Figure 2). Using this nsDBN, Jia and
Huan [41] reconstructed time-varying GINs from
three sets of time-course gene expression data measured from macrophage, Arabidopsis and Drosophila.
Another extension of DBN is time-varying DBN
(TV-DBN) [42]. Unlike nsDBN involving segmentation of the whole time span, TV-DBN estimates a
network structure for a fixed set of nodes at each
time point using a kernel re-weighted auto-regressive method. For continuous transitions of network
structures, this method down-weights gene expression
data from distant time points more than those from
close time points using a kernel function. Dimitrakopoulou et al. [44] used TV-DBN to reconstruct
dynamic GINs from time-course data measured
daily for up to 5 days after infection of mice with influenza
A virus. They first identified 3500 DEGs
and clustered them into 35 groups using k-means
clustering. To reduce the complexity, they then reconstructed TV-DBNs for the 35 groups using the
mean gene expression profiles in the groups. The
resulting dynamic networks revealed that the interactions among the groups including early innate immunity-related genes (Tlr1, Tlr2 and Tlr6) were
highly activated until 3 days, whereas interactions
among the groups including the outputs of the
early innate immunity (Ifnb1 and Ifnar2) showed sustained activation up to 5 days. Thus, these examples
showed that nsDBN and TV-DBN can be used to
effectively model temporal activation and transition
of nodes and edges during the course of cellular
events. However, these methods require discretization of gene expression data, leading to loss of
information. In contrast, auto regressive time varying
models (ARTIVA) [45] can estimate time-evolving
networks without the discretization of gene expression data. ARTIVA was used to reconstruct GINs
using two sets of time-course expression data collected during the developmental stages of D. melanogaster and the response of S. cerevisiae to benomyl
poisoning. These applications demonstrated that
ARTIVA can recover essential temporal dependencies between the genes.
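The kernel re-weighting idea behind time-varying auto-regressive models can be sketched as a weighted least-squares fit centered on a time point of interest. The example below is a minimal sketch under stated assumptions: it uses a Gaussian kernel with an arbitrary bandwidth, omits the sparsity penalty used in practice, and runs on simulated data in which node 0 activates node 1 in the first half of the time course and represses it in the second half. The function name `kernel_weighted_transition` is hypothetical and does not correspond to the API of any cited software.

```python
import numpy as np

def kernel_weighted_transition(X, t_star, bandwidth):
    """Estimate A(t_star) in x(t + 1) ≈ A(t_star) x(t) by Gaussian-kernel
    weighted least squares. X is a (T, p) time-course matrix; transitions
    far from t_star receive small weights."""
    past, future = X[:-1], X[1:]
    times = np.arange(len(past))
    weights = np.exp(-0.5 * ((times - t_star) / bandwidth) ** 2)
    # Scaling rows by sqrt(weight) turns weighted into ordinary least squares.
    sw = np.sqrt(weights)[:, None]
    B, *_ = np.linalg.lstsq(sw * past, sw * future, rcond=None)
    return B.T

rng = np.random.default_rng(2)
T = 200
X = np.zeros((T, 2))
X[:, 0] = rng.normal(size=T)              # node 0: external driver
for t in range(T - 1):
    sign = 1.0 if t < T // 2 else -1.0    # activation first, repression later
    X[t + 1, 1] = sign * X[t, 0] + 0.1 * rng.normal()

A_early = kernel_weighted_transition(X, t_star=30, bandwidth=10)
A_late = kernel_weighted_transition(X, t_star=170, bandwidth=10)
print("effect of node 0 on node 1, early:", round(A_early[1, 0], 2))
print("effect of node 0 on node 1, late: ", round(A_late[1, 0], 2))
```

A single model fitted over the whole time span would average the two regimes and miss the edge entirely, which is the motivation for estimating one network per time point.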
MRF-based methods [10, 46] have also been
developed to reconstruct dynamic GINs using
time-course gene expression data. Unlike BNs that
use DAGs, MRF methods use an undirected probabilistic graphical model that represents statistical dependence among the nodes, thus permitting the inference of
cyclic regulatory relationships (MRF network in
Figure 2). Among the MRF methods, temporally
smoothed l1-regularized logistic regression (TESLA
[46]; total variation in Kolar et al. [47]) and kernel-reweighted logistic regression (KELLER [10]; smooth
in Kolar et al. [47]) use binary information (e.g. activation or non-activation) of gene expression data to
reconstruct dynamic networks by logistic regression.
These methods commonly use l1-regularized logistic
regression formulation. However, they use different
methods to achieve smooth transition of dynamic
networks over time. TESLA minimizes total variation of interactions in the networks over time to
generate similar structures in dynamic networks at
adjacent time points. On the other hand, KELLER
applies a kernel weight function to the likelihood being
maximized for network inference, thus generating
network structures with smooth transitions between adjacent time points. Song et al. [10] applied KELLER
to time-course gene expression data measured during
the developmental course of D. melanogaster and generated GINs of 588 genes at 66 time points during
the four developmental stages: embryonic, larval,
pupal and adult stages. The resulting 66 time-evolving networks provided activation of edges among
the genes at the four stages and their transitions along
the stages. For example, the interaction between
misshapen (msn) and dreadlocks (dock) was activated
during embryonic stage, but then disappeared in the
following stages [48]. Moreover, 30 and 27 genes
related to embryonic and post-embryonic development, respectively, had the highest interactivities
among themselves at the corresponding stages. This
result implies that the time-evolving networks closely reflect
the temporal activation and transition of nodes
and edges during the different developmental stages.
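The two smoothing strategies described above can be written schematically as follows. The notation is ours and follows the verbal description in this section rather than the exact formulations in [10, 46]: x_i^t is the binary state of node i at time t, x_{\setminus i}^t collects the states of all other nodes, \theta_i^t is the vector of edge weights of node i at time t, K_h is a kernel with bandwidth h, and \lambda, \lambda_1, \lambda_2 are regularization parameters.

```latex
\begin{aligned}
\ell(\theta; x^t) &= \log P\left(x_i^t \mid x_{\setminus i}^t; \theta\right)
  && \text{(logistic log-likelihood of node } i\text{)} \\[4pt]
\text{KELLER:}\quad
\hat{\theta}_i^{t^*} &= \arg\min_{\theta}\;
  -\sum_{t=1}^{T} w_t(t^*)\,\ell(\theta; x^t) + \lambda \lVert \theta \rVert_1,
  \qquad w_t(t^*) = \frac{K_h(t - t^*)}{\sum_{t'=1}^{T} K_h(t' - t^*)} \\[4pt]
\text{TESLA:}\quad
\{\hat{\theta}_i^{t}\}_{t=1}^{T} &= \arg\min_{\theta_i^1,\dots,\theta_i^T}\;
  -\sum_{t=1}^{T} \ell(\theta_i^t; x^t)
  + \lambda_1 \sum_{t=1}^{T} \lVert \theta_i^t \rVert_1
  + \lambda_2 \sum_{t=2}^{T} \lVert \theta_i^t - \theta_i^{t-1} \rVert_1
\end{aligned}
```

In both cases the l1 term keeps each network sparse. KELLER obtains smoothness by borrowing observations from neighboring time points through the weights w_t(t^*), whereas TESLA obtains it by penalizing the total variation of the edge weights between adjacent time points, which favors piecewise-constant network structures.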
These methods to infer dynamic networks with
temporal transition of edges use a larger number of
parameters to be optimized than those for networks
with temporal associations and temporal transition of
nodes, leading to an increased search space of network structures and also often to difficulties in identifying the optimal network structures. Thus, these
methods suffer from a large number of candidate
structures of dynamic networks for a time-course
gene expression data set. To overcome this drawback, efficient optimization algorithms or additional
constraints on network structure (e.g. integration of
biological knowledge from the literature) are needed.
DIRECTED AND UNDIRECTED
DYNAMIC NETWORKS
Dynamic networks inferred from time-course gene
expression data can be further categorized into directed and undirected networks, according to
whether edges in the network are directed or undirected (rows in the left of Figure 1). The directionality of edges in the network can be interpreted as
the causality in interactions between nodes. The
causal relationships may produce additional insights
into regulatory mechanisms in dynamic networks,
such as signaling or transcriptional cascades, and feedback and feedforward loops. The methods based on
probabilistic graphical models reconstruct either directed or undirected networks according to their
graphical representations: BN methods (BN, DBN,
nsDBN, TV-DBN, nhDBN and ARTIVA) generate
directed networks, whereas MRF methods (TESLA
and KELLER) generate undirected networks.
RN methods generate undirected dynamic networks because of the non-directional nature of the
relevance measures (i.e. CC and MI). To infer causal
relationships between the nodes, several RN methods, such as a pattern recognition approach [49]
and TimeDelay-ARACNE [50], used modified
relevance measures to incorporate time-delay
[26, 49–51]. For example, time-lagged correlation,
a modified Pearson correlation, was used to maximally capture the correlation in the time-course
data [51]. The causal relationships inferred using
this time-lagged correlation result in a directed
GIN that includes cyclic regulatory relationships
like the original GIN. Also, Zhao et al. [26] used a
time-lagged MI to infer a directed dynamic network
that describes muscle development during the life
cycle of Drosophila. In addition, Yuan et al. [52]
proposed directed partial CC to reconstruct a directed dynamic network describing starch metabolism
in Arabidopsis from time-course gene expression data.
Partial CC for a pair of nodes means the correlation
between the residual temporal expression profiles
that cannot be explained by the rest of nodes in
the network. To infer a causal relationship from
node i to node j, the method evaluates the change
in partial CC after removal of node i and then concludes that there should be a directed link from node
i to node j if the change in partial CC is significant
based on a t-test. Finally, the methods for inferring
dynamic networks with temporal transition of
nodes (middle column in Figure 1) generate directed
or undirected dynamic networks depending on directionality information of known PPIs and PDIs
used. For example, undirected networks are commonly generated using PPIs, whereas directed networks can be generated using interactions from
signaling pathways in KEGG pathway database.
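The time-lagged correlation mentioned above can be illustrated with a minimal sketch. It is not the implementation used in [51]; the function names, the restriction to non-negative lags and the toy profiles are assumptions made only for illustration. The sketch scans candidate lags and keeps the one that maximizes the absolute Pearson correlation between x(t) and y(t + lag).

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation information
    return sxy / math.sqrt(sxx * syy)

def time_lagged_correlation(x, y, max_lag):
    """Return (best_lag, r) maximizing |corr(x[t], y[t + lag])| for lag in 0..max_lag.

    Only non-negative lags are scanned (an assumption of this sketch), so a
    strong value at lag > 0 is read as x leading y.
    """
    best_lag, best_r = 0, pearson(x, y)
    for lag in range(1, max_lag + 1):
        r = pearson(x[:-lag], y[lag:])
        if abs(r) > abs(best_r):
            best_lag, best_r = lag, r
    return best_lag, best_r

# Hypothetical toy profiles: y repeats x two time points later.
x = [0, 1, 3, 2, 0, 1, 4, 2, 0, 1]
y = [0, 0, 0, 1, 3, 2, 0, 1, 4, 2]
lag, r = time_lagged_correlation(x, y, max_lag=3)
```

In this toy case the best lag is 2, which would be interpreted as a candidate directed edge from x to y. With short time courses, each additional lag removes data points, so the maximum lag should be kept small.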
INTEGRATIVE INFERENCE OF
DYNAMIC NETWORKS
Estimation of a dynamic network from time-course
gene expression data is an under-determined problem. Thus, there are a number of candidate network
structures that can generate observed time-course
gene expression data. When there are a large
number of candidate network structures (e.g. when
temporal transition of edges is estimated), other data
complementary to time-course gene expression data
can be integrated to reduce the number of possible
structures, leading to accurate inference of dynamic
networks. Various types of complementary data
have been used (rows on the right of Figure 1). For
example, known PPI and PDI interactomes are
used as prior information in statistical inference of
the conditional dependence between the nodes.
Furthermore, in silico predicted PDI data, such as
TF-binding sites (TFBSs) predicted from motif scan
methods, and experimentally measured PDI data,
such as ChIP-chip or ChIP-seq data, can be integrated to infer reliable edges between TFs and
their target genes in dynamic TRNs.
These data have been integrated into the models
used to estimate dynamic networks in various
ways to reduce the search space of network structures. BN methods are not practically applicable
to a large number of genes. Thus, several BN methods (e.g. DBN, nsDBN and TV-DBN) integrated
gene-function information to select the genes
(nodes) to be included in dynamic networks. For
example, Ong et al. [18] selected 15 genes and 9
operons involved in tryptophan metabolism in
Escherichia coli and developed a tryptophan metabolism-related dynamic network describing regulatory
relationships among these selected operons and
genes. Alternatively, the methods for inferring dynamic networks with temporal transition of nodes
(i.e. DEGs) integrated known interactions to
restrict the edge search space to the known interactomes used [36, 37, 39]. This approach can reduce
the false-positive interactions that arise when methods estimate edges among all possible pairs of
DEGs (e.g. TV-DBN and KELLER). PNA selects
only an active subset from the known interactomes
to reduce false-positive interactions that could not be
activated based on time-course gene expression data.
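The reduction of the edge search space by known interactomes can be sketched as follows. This is an illustrative example only, not code from any of the cited methods; the gene names and the `candidate_edges` helper are hypothetical.

```python
from itertools import combinations

def candidate_edges(degs, interactome=None):
    """Candidate undirected edges among DEGs.

    Without an interactome every pair of DEGs is a candidate; with one, only
    pairs that are known interactions are kept.
    """
    all_pairs = {frozenset(p) for p in combinations(sorted(degs), 2)}
    if interactome is None:
        return all_pairs
    known = {frozenset(e) for e in interactome}
    return all_pairs & known

# Hypothetical DEGs and known interactions (g9 is not differentially expressed).
degs = {"g1", "g2", "g3", "g4", "g5"}
known_interactions = [("g1", "g2"), ("g2", "g3"), ("g4", "g9"), ("g3", "g5")]

unrestricted = candidate_edges(degs)                   # all 10 pairs of 5 DEGs
restricted = candidate_edges(degs, known_interactions)  # 3 known pairs among DEGs
```

The unrestricted search space grows quadratically with the number of DEGs, whereas the restricted one is bounded by the size of the interactome. The cost, as noted in the Discussion, is that interactions absent from the interactome can never be recovered.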
Furthermore, previous knowledge on the system
can be integrated at the model construction stage.
Ong et al. [18] integrated an operon map that represents co-regulated clusters (operons) of genes, with
time-course gene expression data measured from
E. coli. They modeled a dynamic network in which
operons were used as nodes, edges were inferred to
link the operon nodes and the co-regulated genes
were used as subnodes that are connected to the
corresponding operon node according to the predefined operon map. The integration of the
operon map led to reduction in the search space of
network structures because of the smaller number of
operons than that of genes. Furthermore, the resulting dynamic network correctly captured known
interactions between the genes involved in tryptophan metabolism in E. coli, through the links between
the operons including the genes, which could not be
inferred without integration of the operon map.
Moreover, the data can be integrated into the inference algorithms, in addition to the model as
described earlier in the text. For example, a BN
method developed by Tamada et al. [53] integrated
promoter sequences. This method first estimates an
initial network using gene expression profiles obtained after disruption of 100 transcriptional regulators in S. cerevisiae. Based on this initial network, they
detected consensus motifs in promoter sequences of
target genes regulated by common transcriptional
regulators. Using the consensus motifs, the method
re-estimated the network. The procedure is repeated
until the estimated network structure converges.
The comparison of inferred networks with and
without integration of promoter sequences revealed
that the integration of promoter sequences resulted
in a more accurate network than that inferred without the integration. For instance, without the integration, GAL2 was inferred as an upstream regulator
of GAL11. This erroneous link was corrected in the
network inferred with the integration of promoter
sequences. Therefore, this example demonstrates that
integration of the data into both the model and the
inference algorithm can lead to more accurate inference of dynamic networks [53].
FUNCTIONAL INTERPRETATION
OF DYNAMIC NETWORKS
Dynamic networks from time-course gene expression data provide the basis for identifying sequentially activated cellular processes and the regulatory relationships underlying
their sequential activation (e.g. signaling or transcriptional cascades and feedback or feedforward loops).
Thus, interpretation of dynamic networks estimated
from the aforementioned methods is essential to
understand temporal activation of the cellular processes and also mechanisms underlying their temporal
activation. Several strategies have been introduced
for functional interpretation of dynamic networks.
Most of these strategies are based on GO analysis
of the nodes in the dynamic networks. These methods typically identify GO terms enriched by the
nodes in a network constructed at each time point
and then compare the enriched GO terms at consecutive time points to understand temporal transitions of cellular processes associated with the GO
terms (Node-based ontology analysis in Figure 3A).
For example, Song et al. [10] identified GO terms
related to developmental processes (e.g. muscle development and wing vein morphogenesis) for the
nodes in networks at all time points and then identified interactions of these GO terms at individual
time points. Each pair of two GO terms is connected
when the total number of links between the genes
assigned with the two GO terms in the network at
each time point exceeds a cut-off value (Figure 3B).
The temporal transition of the interactions between
the GO terms represents activation of their associated
cellular processes (individual GO terms) and their
collective activation (a set of GO terms) over time.
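The construction of GO term interactions described above can be sketched in a few lines. The sketch assumes a hypothetical gene-to-GO-term mapping and toy networks, and it is not the code of Song et al. [10]; two GO terms are connected at a time point when the number of gene-gene links between them exceeds a cut-off.

```python
from collections import Counter
from itertools import product

def go_term_network(edges, gene_to_terms, cutoff):
    """Connect two GO terms when the number of gene-gene links between them exceeds cutoff."""
    counts = Counter()
    for u, v in edges:
        # Collect the distinct term pairs bridged by this link so that each
        # link is counted once per pair of GO terms.
        pairs = set()
        for a, b in product(gene_to_terms.get(u, ()), gene_to_terms.get(v, ())):
            if a != b:
                pairs.add(frozenset((a, b)))
        for pair in pairs:
            counts[pair] += 1
    return {pair for pair, n in counts.items() if n > cutoff}

# Hypothetical annotation and networks at two time points.
gene_to_terms = {"a1": {"A"}, "a2": {"A"}, "b1": {"B"}, "b2": {"B"}, "c1": {"C"}}
networks = {
    1: [("a1", "b1"), ("a2", "b2"), ("a1", "b2"), ("b1", "c1")],
    2: [("a1", "b1"), ("b1", "c1"), ("b2", "c1"), ("a2", "c1")],
}
term_links = {t: go_term_network(e, gene_to_terms, cutoff=1) for t, e in networks.items()}
```

With a cut-off of 1, terms A and B are connected at the first time point and terms B and C at the second, which mimics a transition of GO term interactions over time. The choice of cut-off is an assumption and would need to be tuned to the network density.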
Temporal transition of cellular processes is defined
not only by nodes (genes or proteins) but also by
their physical/functional interactions [54, 55].
Node-based GO analysis cannot readily distinguish
the difference between cellular functions defined
by the networks with the same list of nodes, but
with different sets of edges. This raises the necessity
to consider both nodes and edges for interpretation
of dynamic networks.
To address this issue, a recent study developed an
edge-based GO analysis method [56]. This analysis
(edge-based ontology analysis in Figure 3A) first assigns link ontologies based on GO terms of the two
interacting nodes in the network and then identifies
enriched link ontologies at each time point. For
example, Wang et al. [56] constructed dynamic networks using gene expression data generated from
incipient, moderate and severe stages of Alzheimer’s
disease (AD) and applied this method to interpret the
dynamic networks. The node-based GO analysis
did not yield GO terms that could be used to
distinguish different stages of AD progression. In
contrast, the edge-based GO analysis correctly identified GO terms specific to each stage of AD: (i) vesicle-mediated transport at the incipient stage; (ii)
regulation of kinase activity for the moderate stage;
and (iii) sterol transport, apoptosis and proteolysis at
the severe stage. Thus, the edge-based GO analysis
more effectively distinguishes the difference of cellular functions represented by the networks at individual time points, thereby leading to better
understanding of temporal transition of cellular functions delineated by dynamic networks.
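The difference between the node-based and edge-based analyses can be made concrete with a small sketch. It is a simplification under stated assumptions: a link ontology is taken here to be the unordered pair of GO terms of the two linked nodes, the enrichment test used in [56] is omitted, and the gene names and annotations are hypothetical.

```python
from collections import Counter
from itertools import product

def node_terms(edges, gene_to_terms):
    """GO terms carried by the nodes that appear in a network."""
    nodes = {n for e in edges for n in e}
    return {t for n in nodes for t in gene_to_terms.get(n, ())}

def link_ontology_counts(edges, gene_to_terms):
    """Count link ontologies, each an unordered pair of GO terms of two linked nodes."""
    counts = Counter()
    for u, v in edges:
        for a, b in product(gene_to_terms.get(u, ()), gene_to_terms.get(v, ())):
            counts[tuple(sorted((a, b)))] += 1
    return counts

# Hypothetical annotation; g1 and g2 share the same nodes but differ in edges.
gene_to_terms = {
    "p": {"transport"},
    "q": {"kinase regulation"},
    "r": {"apoptosis"},
    "s": {"transport"},
}
g1 = [("p", "s"), ("q", "r")]
g2 = [("p", "q"), ("r", "s")]

same_node_terms = node_terms(g1, gene_to_terms) == node_terms(g2, gene_to_terms)
links_g1 = link_ontology_counts(g1, gene_to_terms)
links_g2 = link_ontology_counts(g2, gene_to_terms)
```

Because both networks contain the same nodes, the node-based term sets are identical, whereas the link ontology counts differ. A full analysis would additionally test each link ontology for enrichment against a background.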
In addition to the GO analysis, temporal transitions of characteristics of dynamic networks (e.g. distributions of shortest paths, degrees of centrality and
clustering coefficients) have also been analyzed [10].
Furthermore, temporal changes of network motifs in
dynamic networks provide useful insights into transition of kinetic behaviors in outputs of dynamic
networks (network motif analysis in Figure 3A).
Network motifs are recurring building blocks in biological networks, including feedback and feedforward loops [57]. Several tools, such as FANMOD
[58] and MAVisto [59], have been developed to
identify overrepresented network motifs in a biological network. Comparison of the overrepresented
regulatory motifs in dynamic networks across time
points can reveal disappearance of existing motifs and
formation of new regulatory motifs, providing the
knowledge of how dynamic behaviors in response
to a certain stimulus can vary over time. For example, Kim et al. [60] selected DEGs from time-course gene expression data generated at multiple
stages of Drosophila embryogenesis and reconstructed
Figure 3: Functional interpretation of dynamic networks. (A) Node-based gene ontology (GO) analysis (first row)
results in GO terms (node GO terms) enriched by a list of the nodes (xit) in a dynamic network (Gt) at t. In contrast,
edge-based ontology analysis (second row) defines link GO terms using GO terms of the nodes linked by the
edges, resulting in enriched link GO terms in a dynamic network at t. Dynamic networks G1 and G2 in the first
and second rows have the same list of nodes, but different lists of edges. Thus, the node-based analysis produces
the same GO terms at different time points (node GO terms in first row), whereas the edge-based analysis generates different GO terms over time (link GO terms in second row). Network motif analysis (third row) identifies regulatory
motifs included in a dynamic network at t and compares the lists of regulatory motifs at consecutive time points
to decode transitions of the regulatory motifs. For example, G1 has a feedforward loop, but G2 has both feedback
and feedforward loops. Thus, the feedback loop was newly formed at t = 2, whereas the feedforward loop was activated at t = 1 and 2 and then disappeared at t = 3. (B) Temporal transition of interactions between gene ontology
(GO) terms. Along the progression of cellular events, GO terms B, C and D were activated at stage 1, but GO
term A was newly activated and GO term B became inactivated at stage 2, and the interaction between
GO terms C and D disappeared at stage 3, which reflects reduced interactions among the genes associated with
GO terms C and D at stage 3.
active GINs at different stages of the embryogenesis
using TRANSFAC database. By comparing the
overrepresented network motifs at individual time
points, they identified that feedback loops appeared
more frequently at later stages than at early stages.
Furthermore, based on GO analysis of the nodes in
these network motifs, the genes in the feedback
loops were related to cell fate determination at the
late stages, implying utilization of the feedback loops
for achieving cell fate determination. This is consistent with previous findings that bistability of positive
feedbacks serves as a cell fate determination mechanism [61].
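The comparison of regulatory motifs across time points can be sketched as below. The sketch only counts three-node feedforward and feedback loops in small directed networks; it does not test overrepresentation against randomized networks as FANMOD and MAVisto do, and the toy networks are hypothetical, loosely following the scenario of Figure 3A.

```python
from itertools import permutations

def count_motifs(edges):
    """Count three-node feedforward loops (a->b, b->c, a->c) and
    three-node feedback loops (a->b, b->c, c->a) in a directed network."""
    edge_set = set(edges)
    nodes = sorted({n for e in edge_set for n in e})
    ffl = 0
    fbl = 0
    for a, b, c in permutations(nodes, 3):
        if (a, b) in edge_set and (b, c) in edge_set:
            if (a, c) in edge_set:
                ffl += 1
            if (c, a) in edge_set:
                fbl += 1
    # Each three-node cycle is visited once per rotation, hence the division by 3.
    return {"feedforward": ffl, "feedback": fbl // 3}

# Hypothetical networks at three consecutive time points.
networks = {
    1: [("X", "Y"), ("Y", "Z"), ("X", "Z")],
    2: [("X", "Y"), ("Y", "Z"), ("X", "Z"), ("Z", "W"), ("W", "X")],
    3: [("X", "Z"), ("Z", "W"), ("W", "X")],
}
motifs = {t: count_motifs(e) for t, e in networks.items()}
```

Here the feedforward loop is present at the first two time points and disappears at the third, while the feedback loop is newly formed at the second time point. The exhaustive enumeration is cubic in the number of nodes and is intended only for small illustrative networks.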
SPATIAL TRANSITION OF NODES
AND EDGES
In eukaryotic cells, the internal space of a cell is
compartmentalized, which allows the functions of proteins to be converted or diversified by modulating their spatial
distribution in response to external stimuli or post-translational
modifications (PTMs). Proteins localized at the same
subcellular compartment are often functionally associated through their interactions, as shown in phenotypically similar diseases and disease comorbidity
[62]. More than 35% of the cellular proteome is
localized in multiple compartments [63, 64]. If a protein is localized, for instance, in both the nucleus and the cytosol,
it will function differently in the two compartments
[65, 66] by forming spatially distinct interaction networks. Thus, in addition to temporal transition, spatial transition of proteins and their interactions
should be examined to understand dynamic behaviors in biological systems. Abnormal changes of spatial distribution of proteins can contribute to disease
pathogenesis. For example, abnormal distributions of
proteins between the nucleus and cytosol have been
observed in various types of carcinoma cells [67].
Thus, deciphering spatial transition of proteins can
be crucial to develop therapeutic means to prevent
disease progression.
To understand spatial transition of proteins and
their interactions, various experimental and computational approaches for decoding spatial information
of proteins have been developed [13]. Subcellular
fractionation followed by proteomic analysis is used
to analyze the proteomes in subcellular compartments. The fractionation techniques include not
only classical methods, such as differential centrifugation, density-gradient centrifugation and differential detergent fractionation, but also emerging
methods, such as free-flow electrophoresis, immunoaffinity purification and fluorescence-assisted organelle sorting. Proteomic analysis reveals lists of
proteins localized in individual organelles, as well as
interacting partners of the proteins and their PTMs
in different organelles. The proteomic data are deposited into several databases, such as Organelle DB,
MitoCarta, LOCATE and SUBA [68–71]. Also, a
number of in silico tools for predicting protein localization have been developed. The tools include protein sequence-based methods (e.g. SignalP [72],
PredSL [73] and Predotar [74]), annotation-based
methods (e.g. ProLog-GO [75] and PSLpred [76]),
mixtures of two strategies (e.g. MultiLoc [77] and
WoLF-PSORT [78]) and methods that systematically integrate prediction results from other tools (e.g.
ConLoc [79]). For the dynamic networks with the
information of spatial transition of proteins, the
aforementioned tools for functional interpretation
of dynamic networks (e.g. node- and edge-based
GO analyses) can be applied to understand how cellular functions are distinctively activated in different
organelles and how they transit over time. Similarly,
regulatory motifs defined by spatial segregation of
proteins and how they change over time can be
identified using the aforementioned motif detection
algorithms.
In addition, mathematical modeling using ordinary or partial differential equations can be used to
understand functional roles of the spatial transition
of proteins and their interactions. Mathematical
modeling of biological networks enables us to
understand dynamic operation of the networks.
Combining mathematical modeling with network
motif theory can shed light on understanding the
role of dynamic/multiple localization of proteins in
the regulation of network dynamics. For example,
Santos et al. [80] showed that a positive feedback loop
triggering the cell cycle entry is formed by cyclin B1
phosphorylation and nuclear translocation of active
Cdk1-cyclin B1. Mathematical modeling of this
feedback loop revealed that the spatially coded positive feedback ensures decisiveness and irreversibility
of the cell cycle entry. Mathematical modeling of
biological networks involving proteins localized in
multiple compartments is based on the assumption
that molecules within the same compartment are
homogeneously distributed. When this assumption
does not hold, partial differential equations should
be used to incorporate diffusion effect of molecules
within the same compartment [81].
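The compartment-based assumption can be illustrated with a minimal ordinary differential equation sketch. It is not the model of Santos et al. [80]; it describes a single protein shuttling between the cytosol and the nucleus, treats each compartment as well mixed, tracks amounts rather than concentrations (compartment volumes are ignored) and uses hypothetical rate constants with a simple forward-Euler scheme.

```python
def simulate_shuttling(x_cyt, x_nuc, k_in, k_out, dt, steps):
    """Forward-Euler integration of a two-compartment shuttling model:

        d(x_cyt)/dt = -k_in * x_cyt + k_out * x_nuc
        d(x_nuc)/dt =  k_in * x_cyt - k_out * x_nuc

    x_cyt and x_nuc are amounts in the cytosol and nucleus; k_in and k_out
    are first-order import and export rate constants (assumed values).
    """
    trajectory = [(x_cyt, x_nuc)]
    for _ in range(steps):
        flux = k_in * x_cyt - k_out * x_nuc  # net import into the nucleus
        x_cyt -= flux * dt
        x_nuc += flux * dt
        trajectory.append((x_cyt, x_nuc))
    return trajectory

traj = simulate_shuttling(x_cyt=1.0, x_nuc=0.0, k_in=0.5, k_out=0.1, dt=0.01, steps=5000)
final_cyt, final_nuc = traj[-1]
```

The total amount is conserved, and the nuclear-to-cytosolic ratio approaches k_in / k_out at steady state. When molecules are not homogeneously distributed within a compartment, such a model no longer applies and a partial differential equation with a diffusion term would be needed, as stated above.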
Integrating spatial information of proteins into dynamic networks makes it possible to decode molecular networks governing biological processes in both space
and time. Specifically, we can unfold a dynamic network at each time point into multiple networks in individual subcellular compartments, compare spatially
unfolded networks at consecutive time points and
thus understand the transitions of the proteins and
the regulatory motifs in space and time (Figure 4).
In this example, protein A activates protein B in cytosol (t = 1); the
activated cytosolic protein B then translocates into
nucleus and then induces gene C, which induces
gene B (t = 2), thereby forming a transcriptional
feedback loop between genes B and C; protein C
induced by protein B is then localized in cytosol and
activates protein D (t = 3); and proteins C and D
translocate into mitochondria in which protein C
inactivates protein D, but activates protein E,
which in turn activates protein D (t = 4). This spatial
unfolding of a dynamic network revealed a simple
activation of protein D by protein C in cytosol, as
well as a feedforward loop among proteins C, D and
E in mitochondria, which could not be identified
without consideration of spatial unfolding.
Furthermore, the unfolding can help clarify conflicting information regarding regulatory relationships
between proteins C and D, depending on whether
data represent either cytosolic or mitochondrial
interaction between them. Thus, the space- and
time-evolving nature of the networks provides a
comprehensive basis for understanding the roles of
key molecules and network motifs in diverse physiological and pathological events. Despite the importance of spatial transition of the nodes, however, the
experimental and computational tools are not sufficient to decode transition of the networks in both
space and time. Thus, new experimental and computational tools to improve understanding of spatiotemporal transition of the networks are needed.
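The spatial unfolding described for Figure 4 can be sketched as a simple data transformation. The edge encoding (source, target, sign, compartment) and the helper names are assumptions made for illustration; the edges follow the example in the text.

```python
from collections import defaultdict

def unfold(edges_by_time):
    """Split the network at each time point into per-compartment subnetworks.

    edges_by_time maps t -> list of (source, target, sign, compartment).
    """
    unfolded = {}
    for t, edges in edges_by_time.items():
        by_comp = defaultdict(list)
        for src, dst, sign, comp in edges:
            by_comp[comp].append((src, dst, sign))
        unfolded[t] = dict(by_comp)
    return unfolded

def conflicting_pairs(edges):
    """Directed node pairs reported with both activating (+) and inhibiting (-) signs."""
    signs = defaultdict(set)
    for src, dst, sign in edges:
        signs[(src, dst)].add(sign)
    return {pair for pair, s in signs.items() if len(s) > 1}

# Edges of the Figure 4 example as described in the text.
edges_by_time = {
    1: [("A", "B", "+", "cytosol")],
    2: [("B", "C", "+", "nucleus"), ("C", "B", "+", "nucleus")],
    3: [("C", "D", "+", "cytosol")],
    4: [("C", "D", "-", "mitochondria"), ("C", "E", "+", "mitochondria"),
        ("E", "D", "+", "mitochondria")],
}
unfolded = unfold(edges_by_time)

# Without spatial information, C -> D appears both positive and negative.
pooled = [(s, d, sg) for edges in edges_by_time.values() for s, d, sg, _ in edges]
pooled_conflicts = conflicting_pairs(pooled)

# Within each compartment the relationship is consistent.
per_comp = defaultdict(list)
for comps in unfolded.values():
    for comp, es in comps.items():
        per_comp[comp].extend(es)
conflicts_by_comp = {comp: conflicting_pairs(es) for comp, es in per_comp.items()}
```

Pooling all edges yields a conflict for the pair (C, D), whereas no compartment-specific subnetwork contains a conflict, which reflects how unfolding can reconcile apparently contradictory regulatory data.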
DISCUSSION
In this review, we presented the methods for inferring dynamic networks using time-course global
data. We categorized these methods according to
the presence and absence of temporal transition of
nodes and/or edges. First, BN, DBN and RN methods estimate dynamic networks with temporal regulatory relationships, but with no transition of nodes
and edges. Second, the methods developed by
Figure 4: Spatial unfolding of dynamic networks. A dynamic network (Gt) (first row) shows temporal activation of
nodes and edges at t (t = 1, 2, 3 and 4). A spatially unfolded dynamic network (St) (second row) shows activation
and translocation of nodes and their interactions in three subcellular compartments (cytosol, nucleus and mitochondria) at t. Capital letters denote proteins (nodes), and the subscripts represent the subcellular compartments
(C: cytosol, M: mitochondria and N: nucleus). The comparisons of G and S at t = 3 and 4 show the use of spatial
unfolding of a dynamic network for understanding of spatiotemporal transition of nodes and edges. In S3, D is positively regulated by C in cytosol. In S4, however, D is negatively regulated by C after translocation of C and D to
mitochondria. These spatially unfolded relationships can clarify the conflicting data regarding positive and negative
regulations of D by C without consideration of spatial unfolding. Furthermore, after their translocation, S4 shows a
feedforward loop newly formed in mitochondria at t = 4. These examples demonstrate that spatiotemporally evolving networks can provide a comprehensive basis for understanding transitions of biological systems.
Hwang et al. [37] and Luscombe et al. [36], as well as
PNA, estimate dynamic networks with temporal
transition of nodes. These methods identify genes
or proteins showing differential expression over
time and then cluster them. A network is reconstructed for the molecules in each cluster at individual time points by linking the molecules using known
PPI/PDI interactomes. Third, nsDBN and TV-DBN identify directed dynamic networks with temporal transition of edges. Several MRF methods,
such as TESLA and KELLER, also estimate undirected dynamic networks involving temporal transition of edges.
BN and DBN methods generate directed dynamic networks, but with only a small number of
nodes because of a huge search space of candidate
network structures for a large number of the nodes.
Therefore, these methods are not effective for inferring
dynamic networks with a large number of
nodes. To overcome this drawback, several efficient
algorithms (e.g. GlobalMIT and BNFinder) to
search for optimal network structures from the
huge search space have been developed.
Moreover, additional constraints on network structure have been used to reduce the search space
during the network inference by integrating biological knowledge from the literature or data complementary to the time-course global data used,
such as known PPIs and PDIs (TFBSs and ChIP-chip/seq) and GO annotations. On the other hand,
RN methods can effectively identify dynamic networks with large numbers of nodes. However, RN
methods using conventional measures (CC and MI)
cannot discriminate indirect links from direct ones,
resulting in a large number of indirect edges (i.e.
false-positive edges) in the resulting networks. To
reduce the false-positive edges, several RN methods, such as ParCorA, GeneNet and MRNET
involving alternative relevance measures (e.g. partial
CC and conditional MI) and ARACNE involving
the post-processing of the inferred dynamic networks, have been developed. Also, to estimate directionality of the edges, RN methods with directed
partial and time-delayed relevance measures (e.g.
CC or MI) were developed. Moreover, the methods that infer dynamic networks with temporal
transition of nodes rely on the known interactomes.
Therefore, they cannot identify the interactions not
included in the known interactomes. This drawback can be overcome partially by using the methods to infer temporal transition of edges.
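The way partial CC separates direct from indirect links can be shown with a short sketch. It uses the standard first-order partial correlation formula for three variables and synthetic profiles generated from a hypothetical regulatory chain x -> z -> y; it is not the implementation of ParCorA or GeneNet, which condition on larger sets of nodes.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """First-order partial correlation of x and y given z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Synthetic chain (assumed for illustration): x regulates z, z regulates y.
rng = random.Random(0)
n = 500
x = [rng.gauss(0, 1) for _ in range(n)]
z = [a + rng.gauss(0, 1) for a in x]
y = [b + rng.gauss(0, 1) for b in z]

r_xy = pearson(x, y)                  # sizeable, although x and y are linked only via z
r_xy_given_z = partial_corr(x, y, z)  # close to zero once z is accounted for
```

The ordinary correlation between x and y is substantial and would yield a false-positive edge in an RN, whereas the partial correlation given z is near zero. The large sample size is chosen to make the contrast clear; with only a few time points the estimates are much noisier.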
In this review, we presented the node- and edge-based GO analyses for functional interpretation of dynamic networks. Despite these analyses, it is still challenging to interpret dynamic networks in appropriate
biological contexts. The current network inference
methods are mostly designed to model statistical dependence between the nodes for the inference of
regulatory relationships. To better extract biological
meaning from inferred dynamic networks, however,
it would be essential to integrate other data related to
functions of the nodes and edges, such as GO biological processes (GOBPs) and/or molecular functions
(GOMFs), into the models and the inference algorithms for dynamic networks. In this integrative inference, the nodes should be more likely to interact,
when they show similar temporal expression patterns
and have the same GOBPs, and further to transit together in dynamic networks by forming a functional
module (i.e. densely connected nodes) associated with
the GOBPs. Multiple functional modules can be
formed when the nodes with the same GOBPs
show several different patterns of temporal gene expression. Thus, these functional modules can directly
guide functional interpretation of the dynamic networks, as well as identification of regulatory motifs
and understanding of their transitions.
There is a lack of tools for inferring spatial transition of the nodes from global assay data. To
decode spatiotemporally evolving networks of genes
and proteins, a new method is required that can
integrate spatial information (e.g. GO cellular compartments; GOCCs) of proteins with time-course
gene expression data during the inference of dynamic networks. In this method, the nodes with
the same GOCCs should be more likely to be
linked when the nodes show similar temporal expression patterns, thereby forming a functional
module for the GOCC. Thus, the functional modules can guide interpretation of spatially unfolded
dynamic networks, identification of regulatory
motifs essential for spatiotemporal transitions of the
networks and understanding of transitions of the key
regulatory motifs in both space and time.
Finally, the categorization of the network inference methods in this review can serve as a useful
guideline to select suitable tools for one’s own data
and study. One can define the type of dynamic networks needed to answer biological questions in the
study based on the following aspects: (i) whether
dynamic networks should describe transitions of the
nodes and/or edges, (ii) whether the networks
should be directed or undirected and (iii) what kinds
of data should be integrated into the inference of the
dynamic networks to answer biological questions.
For each of these aspects of the dynamic networks,
one can refer to the corresponding sections in this
review to determine an appropriate method for the
defined type of the dynamic networks. Furthermore,
one should consider the following computational
issues: (i) how large dynamic networks should be
estimated and (ii) how much information the
global time-course data contain. When dynamic
networks involving transition of edges over time are reconstructed,
a small number of data points relative
to the number of nodes prevents
reliable estimation of temporal transition of edges.
For example, suppose that one needs to estimate directed dynamic
networks from time-course gene expression data of 2000
genes measured at 10 time points. In this case, the
number of nodes is much larger than that of samples
(i.e. time points). It is desirable to use either a
method for inferring dynamic networks with temporal association (e.g. RN methods using directed
partial CC) or PNA using the known interactomes.
However, when one can reduce the number of
nodes by focusing on the networks related to a cellular process through integration of GOBPs, directed
dynamic networks with temporal transition of edges
can be inferred using TV-DBN and nsDBN. When
multiple methods can be applied, one can determine
the best dynamic networks after comparing the performance of the methods in answering the biological
questions through the resulting dynamic networks.
Key Points
We summarized 29 computational methods for estimating dynamic networks using time-course global data after categorizing
them according to the presence and absence of temporal transition of nodes and/or edges, directionality of edges and integration of data other than time-course data.
We presented methods for (i) understanding functional transition represented by dynamic networks, (ii) detecting overrepresented network motifs in dynamic networks, (iii) analyzing
temporal transition of network characteristics and (iv) modeling
kinetic behaviors represented by dynamic networks.
We finally discussed methods for unfolding dynamic networks
into multiple subcellular compartments. This spatial unfolding of
dynamic networks can provide understanding of transitions of
biological networks both in time and space.
FUNDING
Korean MEST grants from the Converging Research
Center Program (2011K000896), Global Frontier
Project grant (NRF-M1AXA002–2011-0028392),
Core research program of National Research
Foundation of Korea (2012–0005032),
Proteogenomic Research Program and WCU program (R31–2008-000–10105-0 and R31–10100), as
well as Next-Generation BioGreen 21 Program
(PJ009072) from Korean Rural Development
Administration.
References
1. Collins SR, Kemmeren P, Zhao XC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces
cerevisiae. Mol Cell Proteomics 2007;6:439–450.
2. Park PJ. ChIP-seq: advantages and challenges of a maturing
technology. Nat Rev Genet 2009;10:669–680.
3. Keshava Prasad TS, Goel R, Kandasamy K, et al. Human
Protein Reference Database—2009 update. Nucleic Acids
Res 2009;37:D767–D772.
4. Stark C, Breitkreutz BJ, Chatr-Aryamontri A, et al. The
BioGRID Interaction Database: 2011 update. Nucleic Acids
Res 2011;39:D698–D704.
5. Bader GD, Betel D, Hogue CW. BIND: the Biomolecular
Interaction Network Database. Nucleic Acids Res 2003;31:
248–250.
6. von Mering C, Krause R, Snel B, et al. Comparative assessment of large-scale data sets of protein-protein interactions.
Nature 2002;417:399–403.
7. Sprinzak E, Sattath S, Margalit H. How reliable are experimental protein-protein interaction data? J Mol Biol 2003;
327:919–923.
8. Friedman N, Linial M, Nachman I, et al. Using Bayesian
networks to analyze expression data. J Comp Biol 2000;7:
601–620.
9. Remondini D, O’Connell B, Intrator N, et al. Targeting c-Myc-activated genes with a correlation method: detection
of global changes in large gene expression network dynamics. Proc Natl Acad Sci USA 2005;102:6902–6906.
10. Song L, Kolar M, Xing EP. KELLER: estimating time-varying interactions between genes. Bioinformatics 2009;25:
i128–i136.
11. Bansal M, Belcastro V, Ambesi-Impiombato A, et al. How
to infer gene networks from expression profiles. Mol Syst
Biol 2007;3:122.
12. Morris MK, Saez-Rodriguez J, Sorger PK, et al. Logic-based
models for the analysis of cell signaling networks.
Biochemistry 2010;49:3216–3224.
13. Lee YH, Tan HT, Chung MC. Subcellular fractionation
methods and strategies for proteomics. Proteomics 2010;10:
3935–3956.
14. Przytycka TM, Singh M, Slonim DK. Toward the dynamic
interactome: it’s about time. Brief Bioinform 2010;11:15–29.
15. Spellman PT, Sherlock G, Zhang MQ, et al.
Comprehensive identification of cell cycle-regulated genes
of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol Biol Cell 1998;9:3273–3297.
16. Friedman N, Murphy K, Russell S. Learning the
Structure of Dynamic Probabilistic Networks. UAI, 1998.
17. Kim S, Imoto S, Miyano S. Dynamic Bayesian network and
nonparametric regression for nonlinear modeling of gene
networks from time series gene expression data. Biosystems
2004;75:57–65.
18. Ong IM, Glasner JD, Page D. Modelling regulatory pathways in E. coli from time series expression profiles.
Bioinformatics 2002;18(Suppl 1):S241–S248.
19. Kim SY, Imoto S, Miyano S. Inferring gene networks from
time series microarray data using dynamic Bayesian networks. Brief Bioinform 2003;4:228–235.
20. Vinh NX, Chetty M, Coppel R, et al. GlobalMIT: learning
globally optimal dynamic bayesian network with the mutual
information test criterion. Bioinformatics 2011;27:2765–
2766.
21. Wilczynski B, Dojer N. BNFinder: exact and efficient
method for learning Bayesian networks. Bioinformatics
2009;25:286–287.
22. Scutari M. Learning Bayesian Networks with the bnlearn R
Package. J Stat Softw 2010;35:1–22.
23. Zou M, Conzen SD. A new dynamic Bayesian network
(DBN) approach for identifying gene regulatory networks
from time course microarray data. Bioinformatics 2005;21:71–
79.
24. Butte AJ, Kohane IS. Mutual information relevance networks: functional genomic clustering using pairwise entropy
measurements. Pac Symp Biocomput 2002;415–426.
25. Faith JJ, Hayete B, Thaden JT, et al. Large-scale mapping
and validation of Escherichia coli transcriptional regulation
from a compendium of expression profiles. PLoS Biol
2007;5:e8.
26. Zhao W, Serpedin E, Dougherty ER. Inferring gene regulatory networks from time series data using the minimum
description length principle. Bioinformatics 2006;22:2129–
2135.
27. Margolin AA, Nemenman I, Basso K, et al. ARACNE: an
algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC
Bioinformatics 2006;7(Suppl 1):S7.
28. Altay G, Emmert-Streib F. Inferring the conservative
causal core of gene regulatory networks. BMC Syst Biol
2010;4:132.
29. Basso K, Margolin AA, Stolovitzky G, et al. Reverse engineering of regulatory networks in human B cells. Nat Genet
2005;37:382–390.
30. de la Fuente A, Bing N, Hoeschele I, et al. Discovery of
meaningful associations in genomic data using partial correlation coefficients. Bioinformatics 2004;20:3565–3574.
31. Schafer J, Strimmer K. An empirical Bayes approach to
inferring large-scale gene association networks.
Bioinformatics 2005;21:754–764.
32. Meyer PE, Kontos K, Lafitte F, et al. Information-theoretic
inference of large transcriptional regulatory networks.
EURASIP J Bioinform Syst Biol 2007;2007:79879.
33. Friedman J, Hastie T, Tibshirani R. Sparse inverse covariance estimation with the graphical lasso. Biostatistics 2008;9:
432–441.
34. Krumsiek J, Suhre K, Illig T, et al. Gaussian graphical modeling reconstructs pathway reactions from high-throughput
metabolomics data. BMC Syst Biol 2011;5:21.
35. Yuan M, Lin Y. Model selection and estimation in the
Gaussian graphical model. Biometrika 2007;94:19–35.
36. Luscombe NM, Babu MM, Yu H, et al. Genomic analysis of
regulatory network dynamics reveals large topological
changes. Nature 2004;431:308–312.
37. Hwang D, Lee IY, Yoo H, et al. A systems approach to
prion disease. Mol Syst Biol 2009;5:252.
38. Olsen JV, Blagoev B, Gnad F, et al. Global, in vivo, and site-specific phosphorylation dynamics in signaling networks.
Cell 2006;127:635–648.
39. Kim Y, Kim TK, Yoo J, et al. Principal network analysis:
identification of subnetworks representing major dynamics
using gene expression data. Bioinformatics 2011;27:391–398.
40. Robinson JW, Hartemink AJ. Non-stationary dynamic
Bayesian networks. Proceeding of the Conference on Neural Information Processing Systems, Vol. 3. Vancouver, B. C., Canada:
Neural Information Processing Systems, 2008, 1372–79.
41. Jia Y, Huan J. Constructing non-stationary Dynamic
Bayesian Networks with a flexible lag choosing mechanism.
BMC Bioinformatics 2010;11(Suppl 6):S27.
42. Song L, Kolar M, Xing EP. Time-Varying Dynamic Bayesian
Networks. Proceeding of the Conference on Neural Information
Processing Systems, Vol. 1. Vancouver, B. C., Canada: Neural
Information Processing Systems, 2009, 1732–39.
43. Grzegorczyk M, Husmeier D. Non-homogeneous dynamic
Bayesian networks for continuous data. Mach Learn 2011;83:
355–419.
44. Dimitrakopoulou K, Tsimpouris C, Papadopoulos G, et al.
Dynamic gene network reconstruction from gene expression data in mice after influenza A (H1N1) infection. J Clin
Bioinforma 2011;1:27.
45. Lebre S, Becq J, Devaux F, et al. Statistical inference of the
time-varying structure of gene-regulation networks. BMC
Syst Biol 2010;4:130.
46. Ahmed A, Xing EP. Recovering time-varying networks of
dependencies in social and biological studies. Proc Natl Acad
Sci USA 2009;106:11878–11883.
47. Kolar M, Song L, Ahmed A, et al. Estimating Time-Varying
Networks. Ann Appl Stat 2010;4:94–123.
48. Su YC, Maurel-Zaffran C, Treisman JE, et al. The Ste20
kinase misshapen regulates both photoreceptor axon targeting and dorsal closure, acting downstream of distinct signals.
Mol Cell Biol 2000;20:4736–4744.
49. Chuang CL, Jen CH, Chen CM, et al. A pattern recognition
approach to infer time-lagged genetic interactions.
Bioinformatics 2008;24:1183–1190.
50. Zoppoli P, Morganella S, Ceccarelli M. TimeDelay-ARACNE: Reverse engineering of gene networks from
time-course data by an information theoretic approach.
BMC Bioinformatics 2010;11:154.
51. Schmitt WA, Jr, Raab RM, Stephanopoulos G. Elucidation
of gene interaction networks through time-lagged correlation analysis of transcriptional data. Genome Res 2004;14:
1654–1663.
52. Yuan Y, Li CT, Windram O. Directed partial correlation:
inferring large-scale gene regulatory network through
induced topology disruptions. PLoS One 2011;6:e16835.
53. Tamada Y, Kim S, Bannai H, et al. Estimating gene networks from gene expression data by combining Bayesian
network model with promoter element detection.
Bioinformatics 2003;19(Suppl 2):ii227–ii236.
54. Kitano H. Systems biology: a brief overview. Science 2002;
295:1662–1664.
55. Zhong Q, Simonis N, Li QR, et al. Edgetic perturbation
models of human inherited disorders. Mol Syst Biol 2009;5:321.
56. Wang J, Huang Q, Liu ZP, et al. NOA: a novel network
ontology analysis method. Nucleic Acids Res 2011;39:e87.
57. Alon U. Network motifs: theory and experimental
approaches. Nat Rev Genet 2007;8:450–461.
58. Wernicke S, Rasche F. FANMOD: a tool for fast network
motif detection. Bioinformatics 2006;22:1152–1153.
59. Schreiber F, Schwobbermeyer H. MAVisto: a tool for the
exploration of network motifs. Bioinformatics 2005;21:
3572–3574.
60. Kim MS, Kim JR, Cho KH. Dynamic network rewiring
determines temporal regulatory functions in Drosophila melanogaster development processes. Bioessays 2010;32:505–513.
61. Mitrophanov AY, Groisman EA. Positive feedback in cellular control systems. Bioessays 2008;30:542–555.
62. Park S, Yang JS, Shin YE, et al. Protein localization as a
principal feature of the etiology and comorbidity of genetic
diseases. Mol Syst Biol 2011;7:494.
63. King BR, Guda C. ngLOC: an n-gram-based Bayesian
method for estimating the subcellular proteomes of eukaryotes. Genome Biol 2007;8:R68.
64. Zhang S, Xia X, Shen J, et al. DBMLoc: a Database of
proteins with multiple subcellular localizations. BMC
Bioinformatics 2008;9:127.
65. Garcia AV, Blanvillain-Baufume S, Huibers RP, et al.
Balanced nuclear and cytoplasmic activities of EDS1 are
required for a complete plant innate immune response.
PLoS Pathog 2010;6:e1000970.
66. Wu G, Spalding EP. Separate functions for nuclear and
cytoplasmic cryptochrome 1 during photomorphogenesis
of Arabidopsis seedlings. Proc Natl Acad Sci USA 2007;104:
18813–18818.
67. Kau TR, Way JC, Silver PA. Nuclear transport and cancer:
from mechanism to intervention. Nat Rev Cancer 2004;4:
106–117.
68. Heazlewood JL, Verboom RE, Tonti-Filippini J, et al.
SUBA: the Arabidopsis subcellular database. Nucleic Acids
Res 2007;35:D213–D218.
69. Pagliarini DJ, Calvo SE, Chang B, et al. A mitochondrial
protein compendium elucidates complex I disease biology.
Cell 2008;134:112–123.
70. Sprenger J, Lynn Fink J, Karunaratne S, et al. LOCATE: a
mammalian protein subcellular localization database. Nucleic
Acids Res 2008;36:D230–D233.
71. Wiwatwattana N, Kumar A. Organelle DB: a cross-species
database of protein localization and function. Nucleic Acids
Res 2005;33:D598–D604.
72. Bendtsen JD, Nielsen H, von Heijne G, et al. Improved
prediction of signal peptides: SignalP 3.0. J Mol Biol 2004;
340:783–795.
73. Petsalaki EI, Bagos PG, Litou ZI, et al. PredSL: a tool for the
N-terminal sequence-based prediction of protein subcellular
localization. Genomics Proteomics Bioinformatics 2006;4:48–55.
74. Small I, Peeters N, Legeai F, et al. Predotar: A tool for rapidly screening proteomes for N-terminal targeting
sequences. Proteomics 2004;4:1581–1590.
75. Huang WL, Tung CW, Ho SW, et al. ProLoc-GO: utilizing
informative Gene Ontology terms for sequence-based prediction of protein subcellular localization. BMC
Bioinformatics 2008;9:80.
76. Bhasin M, Garg A, Raghava GP. PSLpred: prediction of
subcellular localization of bacterial proteins. Bioinformatics
2005;21:2522–2524.
77. Hoglund A, Donnes P, Blum T, et al. MultiLoc: prediction
of protein subcellular localization using N-terminal targeting sequences, sequence motifs and amino acid composition. Bioinformatics 2006;22:1158–1165.
78. Horton P, Park KJ, Obayashi T, et al. WoLF PSORT: protein localization predictor. Nucleic Acids Res 2007;35:W585–
W587.
79. Park S, Yang JS, Jang SK, et al. Construction of functional
interaction networks through consensus localization predictions of the human proteome. J Proteome Res 2009;8:
3367–3376.
80. Santos SD, Wollman R, Meyer T, et al. Spatial positive
feedback at the onset of mitosis. Cell 2012;149:1500–1513.
81. Terry AJ, Chaplain MAJ. Spatio-temporal modelling of the
NF-kappa B intracellular signalling pathway: the roles of
diffusion, active transport, and cell geometry. J Theor Biol
2011;290:7–26.
82. Meyer PE, Lafitte F, Bontempi G. minet: A R/Bioconductor
package for inferring large transcriptional networks using
mutual information. BMC Bioinformatics 2008;9:461.