Robustness and network evolution—an entropic principle

ARTICLE IN PRESS
Physica A 346 (2005) 682–696
www.elsevier.com/locate/physa
Lloyd Demetrius (a, b), Thomas Manke (a)
(a) Max-Planck-Institute for Molecular Genetics, Ihnestr. 73, 14195 Berlin, Germany
(b) Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
Available online 12 August 2004
Abstract
This article introduces the concept of network entropy as a characteristic measure of
network topology. We provide computational and analytical support for the hypothesis that
network entropy is a quantitative measure of robustness. We formulate an evolutionary model
based on entropy as a selective criterion and show that (a) it predicts the direction of changes
in network structure over evolutionary time and (b) it accounts for the high degree of
robustness and the heterogeneous connectivity distribution, which is often observed in
biological and technological networks. Our model is based on Darwinian principles of
evolution and preferentially selects networks according to a global fitness criterion, rather than
local preferences in classical models of network growth. We predict that the evolutionarily
stable states of evolved networks will be characterized by extremal values of network entropy.
© 2004 Elsevier B.V. All rights reserved.
PACS: 89.75.-k; 89.75.Fb; 87.23.Kg; 89.75.Hc; 89.75.Da
Keywords: Network evolution; Robustness; Evolutionary principle
Corresponding author.
E-mail addresses: [email protected] (L. Demetrius), [email protected] (T. Manke).
0378-4371/$ - see front matter © 2004 Elsevier B.V. All rights reserved.
doi:10.1016/j.physa.2004.07.011
1. Introduction
Complex systems in nature and technology can be represented by networks, where
the vertices (nodes) denote the basic constituents of the system and links (edges)
describe their relationship or interaction. The recent progress in biological sciences
has highlighted the pervasiveness of molecular networks which control the
information flow and regulation of signals in the cell [1].
The interest in understanding the relation between the structure and the
function of these networks has generated a variety of new developments in both
empirical and analytical studies. The empirical work is driven by efforts to integrate
the enormous amount of relational data emerging from many large-scale
experiments in functional genomics, such as protein–protein interactions, protein–DNA interactions and global gene expression studies. Although such data is
often error-prone, it is free from the traditional bias of hypothesis-driven
experiments and lends itself to global network mapping projects. It is hoped that
in combination these studies will ultimately provide a coherent picture of cellular
processes [2].
Theoretical analysis, on the other hand, is inspired by efforts to elucidate
the relation between network structure and its behavioural properties, and to
explain these relations in terms of an evolutionary process. In this context,
protein interaction networks have attracted particular attention as they
provide the backbone along which various biological signals can propagate in
response to environmental stimuli. They also share many characteristics
with other evolved networks: a heterogeneous connectivity distribution, a large
degree of clustering, and the capacity to remain functional in the face of
random perturbations. This latter property is referred to as robustness—a common
feature which is illustrated by experimental perturbation studies in yeast [3] and by
computational analysis of network observables under node deletion [4].
These authors also found indications of a correlation between the lethality of a
gene deletion and the centrality of the corresponding protein [5]. Robustness
of molecular networks, in view of its relevance to the reliability of
intracellular processing and the viability of the organism, has emerged as a
fundamental concept in the study of the behavioural properties of biological
networks [6].
The seminal work by Albert et al. [4] has generated considerable activity in
attempts to characterize robustness in terms of network topology and to analyse its
evolutionary origin. These authors demonstrated that many evolved networks do
indeed possess a larger degree of robustness under random deletion of nodes than
random graph models. Aldana and Cluzel [7] made the same observation considering
dynamical changes in a network model with heterogeneous connectivity distribution.
Other works have gone beyond the degree distribution and also investigated degree
correlations in the light of their impact on network robustness [8], and the role of
higher order network structures such as cycles [9]. These studies, however, have not
led to any quantitative measures relating network topology to the observed degree of
resilience.
In Section 2 of this paper we give a structural characterization of robustness,
by proposing a novel representation of network topology in terms of network
entropy, a structural property of the network. This concept has its origin in
the ergodic theory of dynamical systems. Entropy in this context is a
fundamental statistical property (Kolmogorov–Sinai invariant), and it completely
characterizes the ergodic behaviour of the dynamical system [10]. The representation
of network topology in terms of the entropy of a dynamical system draws from a
variational principle [11]. The characterization of robustness in terms of entropy
appeals to a recent application of large deviation theory to dynamical systems [12].
These authors derived a fluctuation theorem, which states that network entropy and
stability, as measured by the fluctuation decay rate after random perturbations, are
positively correlated. We invoke this theorem and a set of computational studies to
show that network entropy is a quantitative descriptor of the homeostatic network
properties under random perturbations, a generic term for robustness.
Studies of network evolution have also been driven by the influential work of
Barabasi and Albert [13]. These authors considered network evolution as a growth
process with preferential attachment mechanisms, according to which new nodes are
linked to the existing network based on the node degree as a local criterion. Two
important new developments involving global criteria were recently introduced by
Colizza et al. [14] and Pastor-Satorras et al. [15]. These authors define a cost function
based on the shortest path length and clustering properties of the network as a
selective criterion to be optimized. In Section 3 we present a new model which differs
from these two pioneering studies by appealing to network entropy as the
fundamental selective criterion. There we embody explicitly the mechanism
underlying biological evolution and describe network evolution as a Darwinian
process, in which variation occurs at the molecular level of network changes, and
selection derives from competition between organisms which carry molecular
networks with varying degree of robustness [16]. The outcome of such competition
is modulated by environmental constraints and the Darwinian fitness. Fitness, in this
context, describes the ability of a population to withstand fluctuations in the
demographic variables—a property which can be measured by the demographic
robustness of a population of replicating organisms [12]. We will invoke the
hypothesis that robustness of the molecular network is positively correlated with
demographic robustness. Appealing to this hypothesis and the relation between
network entropy and robustness, we are able to study changes in network topology
under different environmental constraints. As pointed out by Sole et al. [17],
technological evolution may also be described in terms of variation as the result of
innovation, and selection as the result of competition between networks for users
(e.g. in communication networks). If we postulate that the resilience of technological
networks to random perturbations is a characteristic index of their competitive
ability, then our evolution model applies equally well to such technological
networks.
In replacing the local attachment mechanism proposed in Ref. [13] by a global
selection principle, we find that such a framework subsumes earlier models of
network evolution and can generate a large diversity of commonly studied network
architectures. Moreover, we can specify the topological structures which correspond
to evolutionarily stable networks, that is, networks which are optimally adapted to the
environmental conditions, such that changes in their structure will not increase their
selective advantage.
2. Network entropy and robustness
In this section we will define network entropy and present evidence that this
quantity is related to the capacity of the network to withstand random changes in the
network structure.
We represent molecular entities and their interactions as a graph with N
nodes (e.g. proteins) and M links to record an established physical or genetic
interaction. Increasingly, such data derives from recent large-scale experiments
for protein–protein and protein–DNA interactions, but it may also represent the
accumulated knowledge from many years of focussed research, as is the
case for metabolic networks in several different organisms [18]. The topological
structure of the graph can be described by an $N \times N$ adjacency matrix $A = (a_{ij}) \geq 0$,
which is typically sparse (non-zero for only $M \ll N^2$ links). In the case of
undirected and unweighted links the adjacency matrix is symmetric and
Boolean ($a_{ij} = a_{ji} \in \{0, 1\}$). We will use the term graph also for its adjacency
matrix.
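As a minimal sketch of this representation, an undirected and unweighted graph can be stored as a symmetric Boolean matrix. The helper name `adjacency_matrix` and the four-node edge list are illustrative assumptions, not data from the paper.

```python
def adjacency_matrix(n_nodes, edges):
    """Build the symmetric Boolean adjacency matrix of an undirected graph."""
    a = [[0] * n_nodes for _ in range(n_nodes)]
    for i, j in edges:
        a[i][j] = 1
        a[j][i] = 1  # undirected link: a_ij = a_ji
    return a

# Hypothetical toy network with N = 4 nodes and M = 4 links.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
A = adjacency_matrix(4, edges)

# Each undirected link is stored twice in a symmetric matrix.
n_links = sum(map(sum, A)) // 2
```

For realistic network sizes, where $M \ll N^2$, a sparse structure such as an adjacency list would normally replace the dense matrix used here.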
Network entropy. We will now appeal to certain ideas from ergodic theory and
statistical mechanics to characterize the structural properties of the graph in terms of
a function of the number of nodes and the directed links between adjacent nodes. This
function is called entropy on account of the formal similarities with various entropic
concepts which arise in ergodic theory and statistical mechanics. In the following we
will utilize the Kolmogorov–Sinai (KS) entropy, which is a generalization of the
Shannon entropy in that it describes the rate at which a stochastic process generates
information [10]. In our context, information corresponds to a sequence of nodes
visited by an assumed Markov process on the network. The fundamental importance
of the KS-entropy for ergodic theory is its invariance under transformations which
preserve the frequencies with which the network generates time-ordered sequences of
nodes.
We now assume that the stochastic process which defines the information source is
given by a Markov matrix $P = (p_{ij})$. It describes the transition rates from state $i \to j$
($p_{ij} \geq 0$ and $\sum_j p_{ij} = 1$) and its stationary distribution, $\pi = \pi P$. The dynamical entropy
of this process, $H(P)$, is defined as
$$H(P) = \sum_{i=1}^{N} \pi_i H_i, \quad \text{where } H_i = -\sum_j p_{ij} \log p_{ij}. \qquad (1)$$
Here $H_i$ is the Shannon entropy of the distribution $[p_{i1}, \dots, p_{iN}]$ and H is the
weighted average over all stationary states.
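The definition in Eq. (1) can be evaluated directly for a small chain. The following is a sketch under stated assumptions: the two-state matrix is a made-up example, the natural logarithm is used, and the stationary distribution is approximated by repeated multiplication, which presumes an ergodic chain.

```python
import math

def stationary_distribution(P, n_iter=10_000):
    """Approximate pi = pi P by repeated left-multiplication (ergodic chain assumed)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def dynamical_entropy(P):
    """H(P) = sum_i pi_i H_i with H_i = -sum_j p_ij log p_ij (natural log)."""
    pi = stationary_distribution(P)
    H = 0.0
    for i, row in enumerate(P):
        H_i = -sum(p * math.log(p) for p in row if p > 0)  # convention: 0 log 0 = 0
        H += pi[i] * H_i
    return H

# Hypothetical two-state chain; its stationary distribution is (5/6, 1/6).
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = stationary_distribution(P)
H = dynamical_entropy(P)
```

Rows with a single non-zero entry contribute $H_i = 0$, so deterministic transitions generate no information in this measure.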
Network entropy is the entropy of a stochastic matrix associated with the
adjacency matrix $A = (a_{ij})$. The particular matrix we consider is specified in terms of
a variational principle. Let $\lambda$ denote the dominant eigenvalue of A and let $(v_i)$ be the
corresponding leading eigenvector. Furthermore, consider the set $\mathcal{M}_A$ of all
stochastic matrices which satisfy the property that
$$a_{ij} = 0 \Leftrightarrow p_{ij} = 0. \qquad (2)$$
It was shown in Ref. [11] that $\log \lambda$ satisfies a variational principle (the analogue of
the Gibbs variational principle in statistical mechanics)
$$\log \lambda = \sup_{P \in \mathcal{M}_A} \left[ -\sum_{i,j} \pi_i p_{ij} \log p_{ij} + \sum_{i,j} \pi_i p_{ij} \log a_{ij} \right], \qquad (3)$$
and that the supremum over all possible stochastic matrices is attained for the unique
stochastic matrix $P = (p_{ij})$ defined by
$$p_{ij} = \frac{a_{ij} v_j}{\lambda v_i}. \qquad (4)$$
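A sketch of how the matrix in Eq. (4) might be built for a small connected graph follows. The three-node path is a made-up example, and the use of power iteration on $A + I$ is a numerical choice of this sketch (it avoids the oscillation of plain power iteration on bipartite graphs), not a method taken from the paper.

```python
import math

def leading_eigenpair(A, n_iter=2000):
    """Dominant eigenvalue and eigenvector of a symmetric non-negative matrix.

    Iterates with A + I, which has the same eigenvectors as A, so that the
    iteration also converges for bipartite graphs.
    """
    n = len(A)
    v = [1.0] * n
    for _ in range(n_iter):
        w = [v[i] + sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    lam = sum(v[i] * Av[i] for i in range(n))  # Rayleigh quotient, |v| = 1
    return lam, v

def entropy_maximizing_matrix(A):
    """p_ij = a_ij v_j / (lambda v_i); assumes a connected graph so that v_i > 0."""
    lam, v = leading_eigenpair(A)
    n = len(A)
    P = [[A[i][j] * v[j] / (lam * v[i]) for j in range(n)] for i in range(n)]
    return P, lam

# Toy example: a path of three nodes, for which lambda = sqrt(2).
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
P, lam = entropy_maximizing_matrix(A)
```

Each row of `P` sums to one because $\sum_j a_{ij} v_j = \lambda v_i$, and `P` has zeros exactly where `A` does, as required by Eq. (2).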
Network entropy is defined as in Eq. (1) with this particular definition for P, in
which case Eq. (3) reduces to the identity
$$\log \lambda = -\sum_{i,j} \pi_i p_{ij} \log p_{ij} + \sum_{i,j} \pi_i p_{ij} \log a_{ij}. \qquad (5)$$
In the case of a Boolean adjacency matrix the second term in Eq. (5) vanishes and we
have $H = \log \lambda$, which is sometimes called the topological entropy, as it does not
involve the transition rates. In Fig. 1 we illustrate the rationale for this term by
showing four canonical networks with the same number of nodes ($N = 100$) and
edges ($M = 200$), but with different topological entropies owing to their very
different structures. These constructed networks were chosen for their apparent
differences in the degree distribution. We want to stress though that the entropy
defined by Eqs. (1) and (4) is distinct from the entropy of the degree distribution (a
measure of degree heterogeneity). Since network entropy also characterizes the
multiplicity of internal pathways, it is negatively correlated with the average shortest
path length.
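The identity $H = \log \lambda$ for Boolean adjacency matrices can be checked numerically on small graphs. This sketch makes several assumptions: it uses base-2 logarithms (the base is not stated in the text), it relies on the fact that for a symmetric matrix the stationary distribution of the matrix in Eq. (4) is $\pi_i = v_i^2$ for a normalized eigenvector, and the ring lattice and path are illustrative test graphs.

```python
import math

def leading_eigenpair(A, n_iter=500):
    """Power iteration on A + I (a numerical convenience for bipartite graphs)."""
    n = len(A)
    v = [1.0] * n
    for _ in range(n_iter):
        w = [v[i] + sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(A[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam, v

def network_entropy(A, base=2.0):
    """Evaluate Eq. (1) for the matrix of Eq. (4), using pi_i = v_i^2 (symmetric A)."""
    lam, v = leading_eigenpair(A)
    n = len(A)
    H = 0.0
    for i in range(n):
        for j in range(n):
            if A[i][j]:
                p = A[i][j] * v[j] / (lam * v[i])
                H -= v[i] ** 2 * p * math.log(p, base)
    return H, lam

def ring_lattice(n, k):
    """Each node is linked to its k nearest neighbours on a ring (k even)."""
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for d in range(1, k // 2 + 1):
            A[i][(i + d) % n] = A[(i + d) % n][i] = 1
    return A

# A regular graph of degree 4 has lambda = 4, hence H = log2(4) = 2 bits.
H_ring, lam_ring = network_entropy(ring_lattice(20, 4))

# A non-regular example: the three-node path, with lambda = sqrt(2).
H_path, lam_path = network_entropy([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
```

For any regular graph the entropy depends only on the degree, which is why regular networks of fixed degree keep a constant topological entropy as they grow.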
Fig. 1. The topological entropy, H, depends on the combination of several network features such as the
degree distribution and the average shortest path length, l. In this illustration, all networks have the same
number of nodes ($N = 100$) and edges ($M = 200$). [Figure not reproduced; the four panels are labelled
H = 2.00, l = 12.88; H = 2.26, l = 3.46; H = 2.88, l = 3.01; H = 3.94, l = 1.96.]
Robustness. The property robustness pertains to the insensitivity of measurable
parameters of the system to changes in its internal organization. Empirical studies of
this phenomenon distinguish between two types of robustness: dynamical and
topological. Dynamical robustness refers to the insensitivity of measurable
parameters of the network to dynamical changes in the individual variables.
Topological robustness describes the insensitivity of observables of the network to
structural or topological changes in the individual variables or components. In
analytical studies of dynamical systems, robustness is generally quantified as the
response of some observable to changes in the underlying parameters. Attempts to
quantify this property have traditionally studied the behaviour of certain global
quantities under removal of a fraction p of nodes (or edges) [4] or investigated the
properties of simple dynamic models on the network [7,9].
We will now appeal to some recent studies based on large deviation theory and
dynamical systems to propose an analytical characterization of robustness, which
captures both dynamical and topological features. Robustness can be quantified by
analysing deviations of observables in a dynamical system following changes in the
network parameters. This can be formally described as follows: Consider a
perturbation in some kinetic reaction or topological perturbations due to changes
in the network structure. Such changes will generally result in deviations of an
observable (e.g. activity) from its unperturbed value. Let $P_\epsilon(t)$ denote the
probability that the sample mean deviates by more than $\epsilon$ from its unperturbed
value at time t. As t increases, $P_\epsilon(t)$ converges to zero and we define the fluctuation
decay rate, R, as the rate of this convergence on a logarithmic scale:
$$R = \lim_{t \to \infty} \left[ -\frac{1}{t} \log P_\epsilon(t) \right]. \qquad (6)$$
Large values of R entail small deviations of observables from the steady-state
condition and small values of R correspond to large fluctuations around its mean
value. Thus, R characterizes the insensitivity of an observable in the face of
structural and dynamic changes in the underlying parameters.
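To make the definition of R in Eq. (6) concrete, the following toy sketch evaluates the finite-t quantity $-\frac{1}{t} \log P_\epsilon(t)$ for the sample mean of t independent fair coin flips. This i.i.d. example is an assumption chosen only because $P_\epsilon(t)$ can be computed exactly; it is not the network setting of the paper. The limiting value used for comparison is the standard large-deviation (Cramér) rate for a fair coin.

```python
import math
from fractions import Fraction

def deviation_probability(t, eps):
    """P_eps(t): probability that the mean of t fair coin flips deviates from 1/2 by more than eps."""
    favourable = sum(
        math.comb(t, k)
        for k in range(t + 1)
        if abs(Fraction(k, t) - Fraction(1, 2)) > eps
    )
    return Fraction(favourable, 2 ** t)

def decay_rate_estimate(t, eps):
    """Finite-t estimate of R = lim -(1/t) log P_eps(t)."""
    p = deviation_probability(t, eps)
    # Logarithm of an exact rational, taken separately on numerator and denominator.
    return -(math.log(p.numerator) - math.log(p.denominator)) / t

eps = Fraction(1, 10)
a = 0.5 + float(eps)
# Known t -> infinity limit for a fair coin: relative entropy of (a, 1 - a) to (1/2, 1/2).
R_limit = math.log(2) + a * math.log(a) + (1 - a) * math.log(1 - a)

R_200 = decay_rate_estimate(200, eps)
R_2000 = decay_rate_estimate(2000, eps)
```

The finite-t estimates approach the limit from above as t grows, which illustrates why R is defined as a limit on a logarithmic scale.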
The fluctuation theorem [12] asserts that R is positively correlated with network
entropy, defined by
$$H = -\sum_{i,j} \pi_i p_{ij} \log p_{ij}, \qquad (7)$$
with $p_{ij}$ as in Eq. (4). Analytically, we write
$$\Delta H \, \Delta R > 0, \qquad (8)$$
where $\Delta H$ and $\Delta R$ describe changes in the variables H and R, which result from a
change in the parameters that describe the network. The fluctuation decay rate R is a
non-linear property derived from the interactions between the elements that define
the network when the system is in the neighbourhood of a steady-state condition.
The entropy is a macroscopic variable defined at steady state. Hence Eq. (8)
characterizes a non-linear phenomenon in terms of an equilibrium property which
can be described by an operationally measurable property.
The fluctuation theorem described by Eq. (8) is a member of the family of
fluctuation-dissipation theorems which have their origin in the Green–Kubo
formula. This class of theorems connects non-equilibrium behaviour (a perturbed
system relaxing back to equilibrium) to a function that can be calculated for the
equilibrium state. In the case of the Green–Kubo formula this function is the
correlation function; in the fluctuation theorem the function is network entropy. The
entropic fluctuation theorem implies that an increase in entropy entails an increase in
robustness and hence a greater insensitivity of an observable to dynamic or
structural perturbations of the network. As the entropy can be easily calculated for
any network it will serve us as a convenient proxy for robustness.
To provide computational support for Eq. (8) we study the process of network
disintegration under random node removal for three classes of networks with
different topological entropy:
• Scale-free networks with a heterogeneous connectivity distribution which can be
described by a power law. To be specific, we use the Barabasi–Albert (BA) model
of network growth and preferential attachment, where each new node enters the
network with two new links [13].
• Random graph models, where the node degrees follow a Poisson distribution. Here
we use the standard Erdős–Rényi construction of an equilibrium graph with fixed
number of nodes and edges [19].
• Regular networks, where all nodes have precisely the same degree, but the topology
is random otherwise. These networks were constructed by choosing for each node
a fixed number of random neighbours, until all nodes have the same degree.
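The three network classes listed above can be generated with a few lines of standard-library code. This is a sketch under stated assumptions: the sizes are reduced to N = 200, the BA seed graph (a triangle) is a choice of this sketch since the text does not specify it, and the regular network is obtained by degree-preserving link swaps on a ring lattice, which is a substitute for the neighbour-picking construction described in the text.

```python
import random

def erdos_renyi_gnm(n, m, rng):
    """Random graph with exactly n nodes and m distinct links."""
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return set(rng.sample(pairs, m))

def barabasi_albert(n, m, rng):
    """Growth with preferential attachment; each new node brings m links."""
    # Seed (assumed here): a complete graph on m + 1 nodes.
    edges = {(i, j) for i in range(m + 1) for j in range(i + 1, m + 1)}
    # Every node appears in `stubs` once per link end, so a uniform draw
    # selects a node with probability proportional to its degree.
    stubs = [v for e in edges for v in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.add((t, new))
            stubs += [t, new]
    return edges

def random_regular(n, k, rng, swaps_per_edge=10):
    """k-regular graph: ring lattice randomized by degree-preserving link swaps."""
    edge_list = [tuple(sorted((i, (i + d) % n)))
                 for i in range(n) for d in range(1, k // 2 + 1)]
    edge_set = set(edge_list)
    for _ in range(swaps_per_edge * len(edge_list)):
        i1, i2 = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i1], edge_list[i2]
        new1, new2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        # Reject swaps that would create a self-loop or a duplicate link.
        if len({a, b, c, d}) == 4 and new1 not in edge_set and new2 not in edge_set:
            edge_set -= {(a, b), (c, d)}
            edge_set |= {new1, new2}
            edge_list[i1], edge_list[i2] = new1, new2
    return edge_set

def degrees(n, edges):
    """Degree of every node, given a collection of undirected links."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg

rng = random.Random(0)
N, M = 200, 400  # same ratio M / N = 2 as in the text, at a smaller size
er = erdos_renyi_gnm(N, M, rng)
ba = barabasi_albert(N, 2, rng)
reg = random_regular(N, 4, rng)
```

With this seed graph the BA network has 2N - 3 links rather than exactly 2N, so matching the link count of the other two classes exactly would need a small adjustment.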
To be comparable, all the networks were chosen to have the same number of
nodes ($N = 3500$) and edges ($M = 7000$). The results of our analysis are presented in
Fig. 2. There we look at the shortest distances (path lengths) between any two nodes
in the network. Generally, the path lengths in the largest connected component will
increase as more and more nodes are removed from the system. Furthermore, there
will be a fragmentation point (the percolation transition), beyond which the average
shortest path length, l, drops sharply as the largest (giant) component dissolves into
many small components [20].
Fig. 2. This figure illustrates the different degree of robustness for different network architectures
($N = 3500$, $M = 7000$) under random node removal. The behaviour of the average shortest path length, l,
is markedly different for regular networks (black line), Erdős–Rényi graphs (red) and scale-free networks
(BA-model, blue). In the legend we also give the values of network entropy, H. It is apparent that H is a
convenient measure to reflect the different degree of robustness under random perturbation.
Scale-free networks have been previously called robust as they have a weak
dependence of l on p and the fragmentation point is shifted to higher values [4]. This
is reflected by a large entropy in our formalism. Random graphs have a smaller
entropy and fragment more easily, while the minimal entropy is realized by regular
networks, which disintegrate most rapidly under random attack. We would like to
emphasize that robustness under random node removal entails vulnerability under
targeted attack on the highly connected nodes (hubs). For finite networks, the
resulting fluctuations in l (due to hub removal) can be large. For regular networks,
where all nodes have the same degree, the distinction between random and targeted
attack is absent and they disintegrate at almost the same rate.
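The node-removal analysis described in this section can be sketched as follows. The function names and the star-shaped toy graph are assumptions for illustration; the average shortest path length is computed by breadth-first search within the largest connected component, and removal is either uniformly random or targeted at the nodes of highest degree.

```python
import random
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def largest_component(adj):
    """Node set of the largest connected component."""
    seen, best = set(), set()
    for s in adj:
        if s not in seen:
            comp = set(bfs_distances(adj, s))
            seen |= comp
            if len(comp) > len(best):
                best = comp
    return best

def average_path_length(adj):
    """Mean shortest distance between distinct nodes of the largest component."""
    comp = largest_component(adj)
    if len(comp) < 2:
        return 0.0
    total = sum(sum(bfs_distances(adj, s).values()) for s in comp)
    return total / (len(comp) * (len(comp) - 1))

def remove_nodes(adj, removed):
    """Copy of the graph without the removed nodes and their links."""
    removed = set(removed)
    return {u: {w for w in nbrs if w not in removed}
            for u, nbrs in adj.items() if u not in removed}

def random_attack(adj, fraction, rng):
    """Remove a fraction of the nodes chosen uniformly at random."""
    victims = rng.sample(sorted(adj), int(fraction * len(adj)))
    return remove_nodes(adj, victims)

def targeted_attack(adj, fraction):
    """Remove the same fraction of nodes, starting with the highest degrees."""
    by_degree = sorted(adj, key=lambda u: len(adj[u]), reverse=True)
    return remove_nodes(adj, by_degree[:int(fraction * len(adj))])

# Toy check on a star with one hub (node 0) and ten leaves.
star = {0: set(range(1, 11)), **{i: {0} for i in range(1, 11)}}
l_star = average_path_length(star)
after_hub_removal = targeted_attack(star, 0.1)  # removes one node: the hub
after_random_removal = random_attack(star, 0.1, random.Random(1))
```

In the star example, removing the single hub leaves only isolated nodes, an extreme case of the vulnerability to targeted attack mentioned above.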
3. Evolution of networks
In the previous section we introduced network entropy to quantify the robustness
of networks. We will now study the evolution of these networks. In the model we
propose, we emphasize that evolution is a two level process involving variation and
selection in the context of a given environment. In technological networks, variation
can be understood as innovation and selection pressures arise as the result of
competition for new users. One may postulate that the resilience of such networks
will be the determining factor in deciding the outcome of such competition. In
cellular networks, variation occurs at the molecular level (e.g. mutations), and
selection takes place at the organismic level. Therefore, cellular networks (e.g.
protein interaction networks) evolve only in so far as they confer a selective
advantage to the organisms that carry them. Thus their evolution is constrained by
the competitive interaction between organisms for resources.
Evolution at the demographic level has been analysed in Ref. [12]. The models studied
show that the outcome of competition between ancestral and mutant organisms is
determined by the capacity of the population to maintain steady-state population
numbers under perturbation of birth and death rates. This property is called
demographic robustness, a condition which can be quantified in terms of the entropy
of the demographic network. Demetrius et al. [12] have demonstrated that, for large
population size, the competitive outcome is predicted by demographic robustness
and is contingent on environmental constraints: under bounded growth conditions
the demographically more robust populations will prevail, and under unbounded
growth conditions the less robust will replace the more robust. In our study of the
evolution of molecular networks, we postulate that changes in robustness of the
demographic network are positively correlated with changes in the robustness of the
molecular network. Keeping these postulates in mind we will study evolutionary
changes in the network structures using robustness as the selection criterion for both
molecular and technological networks. Robustness, in turn, we will quantify by its
structural correlate, network entropy.
Fig. 3. Here we illustrate our network model as the combination of an evolutionary process (e.g. growth
by node addition) and a selection process acting globally on an ensemble of networks. The probability,
$P(H)$, with which a certain network is chosen for further evolution depends on the value of its entropy H.
The new network model which we propose incorporates the two fundamental
mechanisms of (1) variation and (2) selection. For the purpose of this work, we
formulate variation as a simple growth process which generates a whole ensemble of
new networks from a single ancestral network. Selection takes place at the
macroscopic level where a particular network from the ensemble is chosen based
on its global property "robustness", which is quantified by H. This process is iterated
over time. A realization at a given time is illustrated in Fig. 3.
To be specific, we connect a newly added node in all possible ways to the ancestral
network, thereby generating an ensemble of N new networks with entropies
$$H_{\min} = H_1 \leq H_2 \leq \dots \leq H_N = H_{\max}. \qquad (9)$$
For convenience, we further normalize those values
$$\delta(H_i) \equiv \frac{H_i - H_{\min}}{H_{\max} - H_{\min}}, \qquad 0 \leq \delta(H_i) \leq 1. \qquad (10)$$
Rather than selecting precisely the network with maximal entropy, $\delta(H_{\max}) = 1$, we
invoke a probabilistic notion, in which networks are selected preferentially, but not
strictly, according to the measure of robustness as described by the entropy H. We
define the probability of selecting network i as
$$P(H_i) \propto \begin{cases} \delta(H_i)^{T} & \text{for } T \geq 0, \\ (1 - \delta(H_i))^{-T} & \text{for } T < 0, \end{cases} \qquad (11)$$
in analogy to the generic models of preferential attachment. Notice, however, the
crucial difference in our choice of the selection variable: rather than using a local
variable (node degree) we employ a global measure (network entropy), and instead
of preferential attachment our model invokes the notion of preferential selection.
$P(H)$ defines a probability distribution from which a network is selected. The
parameter T determines how strictly the maximal entropic principle is enforced, or
whether (in the case of $T < 0$) smaller entropies are favoured.
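The selection rule can be written as a short function that turns a list of candidate entropies into normalized probabilities. This is a sketch: the function name, the three entropy values and the uniform fallback for an ensemble in which all entropies coincide (where the normalized value is undefined) are assumptions of this example, not details given in the text.

```python
def selection_probabilities(entropies, T):
    """Normalized selection probabilities for candidate networks with given entropies."""
    h_min, h_max = min(entropies), max(entropies)
    if h_max == h_min:
        # Degenerate ensemble: fall back to a uniform choice (an assumption).
        return [1.0 / len(entropies)] * len(entropies)
    delta = [(h - h_min) / (h_max - h_min) for h in entropies]
    if T >= 0:
        weights = [d ** T for d in delta]             # favours large entropy
    else:
        weights = [(1.0 - d) ** (-T) for d in delta]  # favours small entropy
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical entropies of three candidate networks.
H = [1.0, 1.5, 2.0]
p_neutral = selection_probabilities(H, 0)     # no selective force
p_robust = selection_probabilities(H, 1)      # mild preference for high entropy
p_strict = selection_probabilities(H, 1000)   # nearly strict maximization
p_fragile = selection_probabilities(H, -1)    # preference for low entropy
```

For T = 0 every weight equals one, so all candidates are equally likely, while large positive or negative T approaches a strict maximal or minimal entropy choice.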
The simple growth process described above is easily generalized to situations
where a node enters with m edges at a time, or where a "new" node enters the system
as a result of multiplication and diversification processes [21]. In the following section we
will focus on the question of which network topologies will arise under different
selective pressures (as quantified by the parameter T).
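Variation and selection can be combined into a small simulation of the growth model. This is a minimal sketch, assuming one new link per added node, the topological entropy $\log \lambda$ as the selection variable, a start from two connected nodes, and a uniform choice when all candidates have equal entropy; the function names are hypothetical and the sizes are kept tiny so that every candidate can be evaluated.

```python
import math
import random

def network_entropy(adj, n_iter=500):
    """log(lambda) of a connected graph; power iteration on A + I for stability."""
    v = {u: 1.0 for u in adj}
    for _ in range(n_iter):
        w = {u: v[u] + sum(v[x] for x in adj[u]) for u in adj}
        norm = math.sqrt(sum(x * x for x in w.values()))
        v = {u: x / norm for u, x in w.items()}
    lam = sum(v[u] * sum(v[x] for x in adj[u]) for u in adj)
    return math.log(lam)

def selection_probabilities(entropies, T):
    """Normalized preferential-selection probabilities; uniform if all entropies agree."""
    h_min, h_max = min(entropies), max(entropies)
    if h_max == h_min:
        return [1.0 / len(entropies)] * len(entropies)
    delta = [(h - h_min) / (h_max - h_min) for h in entropies]
    weights = [d ** T if T >= 0 else (1.0 - d) ** (-T) for d in delta]
    total = sum(weights)
    return [w / total for w in weights]

def grow(n_final, T, rng):
    """Grow a network from two linked nodes by repeated variation and selection."""
    adj = {0: [1], 1: [0]}
    while len(adj) < n_final:
        new = len(adj)
        hosts = list(adj)
        # Variation: one candidate network per possible attachment point.
        entropies = []
        for h in hosts:
            adj[h].append(new)
            adj[new] = [h]
            entropies.append(network_entropy(adj))
            adj[h].pop()
            del adj[new]
        # Selection: draw one candidate with probability P(H_i).
        probs = selection_probabilities(entropies, T)
        chosen = rng.choices(hosts, weights=probs)[0]
        adj[chosen].append(new)
        adj[new] = [chosen]
    return adj

star_like = grow(8, 1000, random.Random(0))    # near-strict selection for robustness
chain_like = grow(8, -1000, random.Random(0))  # near-strict selection against it
```

With these extreme parameter values the sketch produces a star for positive T and a chain for negative T, the two limiting tree shapes with the largest and smallest dominant eigenvalue. Evaluating every candidate costs one eigenvalue computation per existing node at each step, so larger networks would need a faster entropy estimate.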
4. Results
The network model introduced above has a free parameter, T, which is ultimately
determined by different environmental conditions. As such, T will vary over
evolutionary time and reflect the extent to which selection-of-the-robust is enforced.
We do not claim that this is always the case. In fact, there will be situations in which
the Darwinian fitness of an organism is determined by its ability to explore a wide
variety of dynamical responses.
Interestingly, we observed a range of different topologies associated with different
parameter regions, which are summarized in Table 1. There we study the behaviour of
network characteristics (distances, degree distribution) as the network evolves
(grows) up to a given size.
• For negative T, networks with smaller entropies are selected preferentially and
robustness is selected against (Fig. 1a–c of Table 1). As a result the evolving
networks show a more and more peaked degree distribution, which approaches
that of a regular graph, where all nodes have the same degree. This limit is
characterized by constant topological entropy for all network sizes (red curve in
Fig. 1a of Table 1). The entropy of the degree distribution $p_k$ vanishes for regular
lattices (see footnote 1), but for finite T and finite N it decreases linearly with N (black curve) due
to a small number of nodes with different degree from the majority. This reflects
the presence of shortcuts, which are also the cause of sudden declines in the
average path length, a quantity which otherwise increases linearly with the
network size (blue curve).
• For $T = 0$ there is no selective force and all networks are chosen with equal
probability (Fig. 2a–c in Table 1). Notice that the resulting networks are not the
same as the classical Erdős–Rényi graphs. For the latter we have an equilibrium
ensemble with fixed number of nodes and edges, while we model a non-equilibrium
process where those numbers grow continuously. Therefore, in this
model, there is still a notion of time, and older nodes can be distinguished from
younger nodes as having a higher degree on average. The observed degree
distribution is not peaked, but falls off exponentially (Fig. 2b). In contrast to
regular networks, the distances in such randomly evolved networks are small,
corresponding to an only logarithmic increase of the average shortest path length
with the network size N (blue curve in Fig. 2a).
• For positive T, networks with large entropy are selected preferentially and the
emerging networks are more robust by construction. This is accompanied by a
sharp drop in the average shortest path length and the emergence of a
heterogeneous degree distribution, which (for large N) is well-approximated by a
power law with T-dependent scaling coefficient $\gamma(T)$. Small networks evolve as
stars, where all nodes are connected to a central hub (see footnote 2). As the network grows
bigger, the probability for non-central connections also rises. Beyond a certain
size, $N_c(T)$, such connections will occur by chance and they will give rise to a
scale-free degree distribution. In Table 1 this transition is marked by a cusp in Fig.
3a. Star-like structures can also emerge in traditional models of attachment and
are often described as a winner-takes-all regime. We want to highlight that in our
model it is not the attachment process (which we think of as totally random), but
the entropic selection process which drives network evolution and may appear as
if preferential attachment were at work.
Footnote 1. $H_{\mathrm{deg}} = -\sum_k p_k \log p_k$.
Table 1
In this table we illustrate how different graph topologies emerge from different values of the parameter T, which
determines how strictly the extremal entropy principle is enforced. [Table panels not reproduced.]
The first row shows the behaviour of several network quantities as a function of time ($\propto$ number of nodes
N): the topological entropy (red circles) remains constant for large negative T and increases as a power of
the network size for large positive T. The average shortest path length (blue) shows the opposite
behaviour. The entropy of the degree distribution is shown in black, while the second row shows the degree
distribution explicitly for $N = 1000$ and $M = 2000$. The corresponding network of this size is visualized in
the bottom row. Here the nodes are coloured differently according to their different degree and edges
interpolate the colours of the neighbouring nodes. The precise values for T used in these four simulations
are $T = (-10000, 0, 0.25, 1)$.
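The entropy of the degree distribution tracked in Table 1 is a different quantity from network entropy, and it is simple to compute from a list of node degrees. The sketch below uses the natural logarithm and two made-up degree sequences (a regular network and a star), both of which are assumptions for illustration.

```python
import math
from collections import Counter

def degree_entropy(degrees):
    """H_deg = -sum_k p_k log p_k, with p_k the fraction of nodes of degree k."""
    n = len(degrees)
    return -sum((c / n) * math.log(c / n) for c in Counter(degrees).values())

regular = [4] * 100       # every node has degree 4
star = [99] + [1] * 99    # one hub and 99 leaves

H_deg_regular = degree_entropy(regular)
H_deg_star = degree_entropy(star)
```

A regular network has a single degree class and hence zero degree entropy. The star also has a small degree entropy because almost all of its nodes share degree one, even though its topological entropy is large, which illustrates why the two measures should not be confused.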
5. Conclusions
Recent studies of biological and technological networks are based on the
observation that most evolved networks cannot be modelled adequately as random
graphs and show a high degree of resilience against perturbations. This has led to
the questions (a) whether common network topologies can be described in terms of a
universal evolutionary principle [17] and (b) how robustness can be quantified in
terms of structural network properties.
In this work we addressed these points and provided a novel characterization of
network topology in terms of network entropy. This concept is derived from the
ergodic theory of dynamical systems. The importance of entropy—and its
applicability to network theory—rests on three fundamental properties which we
have elaborated in this paper:
1. Network entropy is an invariant of the dynamical system. It characterizes the
structure and the ergodic behaviour of a dynamical system operating on the
network.
2. Network entropy is positively correlated with robustness.
3. Evolutionarily stable states are characterized by extremal values of network
entropy. Maximal values of entropy arise where evolution increases robustness,
minimal values of entropy arise where evolution decreases robustness.
In this respect network entropy, as introduced in this paper, is quite distinct from the
entropy of the degree distribution which was recently used to classify networks [15] (see footnote 3).
Footnote 2. In our examples, the initial network consists of two connected nodes, in which case precisely two hubs
will emerge.
Footnote 3. Their work also differs from ours in that these authors studied equilibrium networks, which (in
contrast to our evolutionary model) are not selected according to their robustness, but optimized
according to a different cost function.
We proposed a model which selects networks preferentially according to their
resilience against random perturbations. For technological networks we assume that
robustness is under direct selective pressure, while for biological networks we invoke
the notion that demographic robustness, contingent on environmental conditions, is
the selective criterion. More or less robust network topologies appear as a byproduct
of Darwinian evolution acting on populations of organisms.
Our work also brings together two approaches in the study of network evolution.
On the one hand, several groups have invoked robustness as a guiding principle of
evolution to formulate network selection based on the dynamical properties of
simple dynamical systems on graphs [6,7,9]. On the other hand, phenomenological
models of network growth have been shown to produce robust systems, as reviewed
in Ref. [22]. However, the proposed growth mechanisms (e.g. preferential
attachment) do not capture the nature of biological evolution.
Here we extended these ideas and allowed for direct optimization of robustness
with different stringency under different environmental conditions. Our results
demonstrate how heuristic models can be understood as an effective description of
the selective process at the organismic level. The global preference for more or less
robust systems directly encodes the topological structures we observe in evolved
networks.
Formulating evolution as a two-step process of variation and selection, our work
also suggests future improvements in the description of each step. As a model of
variation, we considered only a simple growth process of node addition at each time
step. At the expense of extra parameters, this could be extended to allow also for
node loss, variable network growth rates or more detailed models of duplication and
diversification events (in the context of proteome evolution). Moreover, one may
introduce variation processes without growth, for example through rewiring.
For the selection process, we utilized an entropic framework with only one
parameter, T, which determines the degree to which selection favours robust systems.
In our computational study we fixed this parameter for all evolutionary times,
corresponding to an idealized situation in which the environmental conditions are
fixed. In more realistic scenarios one should allow for possible variations in T,
corresponding to changes in the environmental constraints. Therefore we would
expect real networks to be mixtures of the limiting cases described in Section 4. In
this study we also limited ourselves to undirected graphs with constant edge weight,
since much of the experimental data does not give a quantitative account of the
observed relations. However, our approach is not limited to this case and can be
readily applied to weighted graphs should such information be available.
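One way such a weighted extension might look is sketched below. It assumes that the entropy is the entropy rate of the Markov chain with transition probabilities p_ij = w_ij v_j / (lambda v_i), where lambda and v are the dominant eigenvalue and eigenvector of a symmetric, non-negative weight matrix w of a connected graph. This construction and the function names (perron, weighted_entropy) are assumptions for illustration, not a formula quoted from this section.

```python
import math

def perron(weights, iters=100000, tol=1e-13):
    """Dominant eigenvalue and unit-norm eigenvector of a symmetric,
    non-negative matrix, by power iteration on W + s*I."""
    n = len(weights)
    s = max(sum(row) for row in weights)   # positive shift prevents oscillation
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = [s * v[i] + sum(weights[i][j] * v[j] for j in range(n))
             for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
        done = max(abs(a - b) for a, b in zip(v, w)) < tol
        v = w
        if done:
            break
    lam = sum(v[i] * sum(weights[i][j] * v[j] for j in range(n))
              for i in range(n))
    return lam, v

def weighted_entropy(weights):
    """Entropy rate of the chain p_ij = w_ij v_j / (lam v_i) (assumed form).

    For symmetric weights and unit-norm v the stationary distribution
    is pi_i = v_i ** 2. Assumes a connected graph, so that every v_i > 0.
    """
    lam, v = perron(weights)
    n = len(weights)
    h = 0.0
    for i in range(n):
        for j in range(n):
            if weights[i][j] > 0:
                p = weights[i][j] * v[j] / (lam * v[i])
                h -= v[i] ** 2 * p * math.log(p)
    return h

path3 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]            # constant edge weight
skewed = [[0, 1, 0], [1, 0, 3], [0, 3, 0]]           # unequal edge weights
print(weighted_entropy(path3), weighted_entropy(skewed))
```

In this sketch a uniform rescaling of all weights leaves the entropy unchanged, and for constant edge weight the value coincides with the logarithm of the dominant adjacency eigenvalue; unequal weights lower the entropy of the three-node path.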
Apart from extending our framework to more realistic scenarios, which take into
account the specificities of a given network (biological, social or technological), we
are already able to address in a new light a number of interesting questions which are
frequently asked about networks:
(1) Structural questions. What are the important elements in complex networks?
Rather than ranking network elements according to their degree, clustering
coefficient or characteristic path lengths, we propose to rank them by their contribution to
the network entropy, which allows key elements to be identified.
Equivalently, we may ask the question how certain structural changes will affect the
network entropy. This can be directly compared to a wealth of experimental data
which is emerging for several biological networks.
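The ranking idea in (1) can be illustrated with a small sketch: delete each node in turn and record how much the entropy drops. The code assumes, as a simplification not restated in this section, that the entropy of an undirected graph with constant edge weight is the logarithm of the dominant adjacency eigenvalue. The function names and the convention of assigning minus infinity to a graph without edges are hypothetical choices made here.

```python
import math

def entropy(adj):
    """log of the dominant adjacency eigenvalue (power iteration on A + I).

    Returns -inf for a graph without edges (a convention chosen here).
    """
    n = len(adj)
    v = [1.0 / math.sqrt(n)] * n
    lam = 0.0
    for _ in range(10000):
        w = [v[i] + sum(adj[i][j] * v[j] for j in range(n)) for i in range(n)]
        new_lam = sum(a * b for a, b in zip(v, w)) - 1.0   # Rayleigh quotient
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
        converged = abs(new_lam - lam) < 1e-12
        lam = new_lam
        if converged:
            break
    return math.log(lam) if lam > 1e-9 else float("-inf")

def remove_node(adj, k):
    """Adjacency matrix with node k and all its links deleted."""
    n = len(adj)
    return [[adj[i][j] for j in range(n) if j != k]
            for i in range(n) if i != k]

def rank_by_entropy_loss(adj):
    """Nodes sorted by the entropy lost when each one is removed."""
    base = entropy(adj)
    loss = {k: base - entropy(remove_node(adj, k)) for k in range(len(adj))}
    return sorted(loss, key=loss.get, reverse=True), loss

# Small example: node 0 is a hub linked to 1, 2, 3; node 4 hangs off node 3.
n = 5
adj = [[0] * n for _ in range(n)]
for i, j in [(0, 1), (0, 2), (0, 3), (3, 4)]:
    adj[i][j] = adj[j][i] = 1

order, loss = rank_by_entropy_loss(adj)
print(order)
```

In this example the hub (node 0) ranks first and the peripheral node 4 ranks last, while node 3, which connects node 4 to the rest, comes second. The same loop could be applied to link deletions to probe other structural changes.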
(2) Evolutionary questions. What is the degree of evolvability of the network? If
evolution optimizes a certain cost function on networks, it makes sense to compare
different evolved networks according to their distance from the evolutionarily stable
state. Since these states are characterized by extremal values of entropy, network entropy provides a
suitable measure for such a classification.
(3) Network comparisons. As more reliable network data become
available, our model can be directly tested: network comparisons between different
organisms can reveal the different environmental conditions under which the
organisms have evolved.
Acknowledgements
We would like to thank Martin Vingron and Kim Sneppen for helpful discussions.
We are also indebted to Andrea Rinaldo for enlightenment on several issues of network
biology. T.M. acknowledges funding by European Community Contract No. QLRI-CT-2001-00015
for “TEMBLOR” under the specific RTD programme “Quality of
Life and Management of Living Resources”.
References
[1] E. Alm, A.P. Arkin, Biological networks, Curr. Opin. Struct. Biol. 13 (2) (2003) 193–202.
[2] M. Vidal, A biological atlas of functional maps, Cell 104 (3) (2001) 333–339.
[3] T.R. Hughes, M.J. Marton, A.R. Jones, C.J. Roberts, R. Stoughton, C.D. Armour, H.A. Bennett,
E. Coffey, H. Dai, Y.D. He, M.J. Kidd, A.M. King, M.R. Meyer, D. Slade, P.Y. Lum, S.B.
Stepaniants, D.D. Shoemaker, D. Gachotte, K. Chakraburtty, J. Simon, M. Bard, S.H. Friend,
Functional discovery via a compendium of expression profiles, Cell 102 (1) (2000) 109–126.
[4] R. Albert, H. Jeong, A.L. Barabasi, Error and attack tolerance of complex networks, Nature 406
(6794) (2000) 378–382.
[5] H. Jeong, S.P. Mason, A.L. Barabasi, Z.N. Oltvai, Lethality and centrality in protein networks,
Nature 411 (6833) (2001) 41–42.
[6] S. Bornholdt, K. Sneppen, Robustness as an evolutionary principle, Proc. Roy. Soc. London B 267
(1459) (2000) 2281–2286.
[7] M. Aldana, P. Cluzel, A natural class of robust networks, Proc. Natl. Acad. Sci. U.S.A. 100 (15)
(2003) 8710–8714.
[8] S. Maslov, K. Sneppen, Specificity and stability in topology of protein networks, Science 296 (5569)
(2002) 910–913.
[9] S. Jain, S. Krishna, A model for the emergence of cooperation, interdependence, and structure in
evolving networks, Proc. Natl. Acad. Sci. U.S.A. 98 (2) (2001) 543–547.
[10] P. Billingsley, Ergodic Theory and Information, Wiley, New York, 1965.
[11] L. Arnold, V. Gundlach, L. Demetrius, Evolutionary formalism for products of positive random
matrices, Ann. Appl. Probab. 4 (3) (1994) 859–901.
[12] L. Demetrius, V. Gundlach, G. Ochs, Complexity and demographic stability in population models,
Theoret. Population Biol. 65 (3) (2004) 211–225.
[13] A.L. Barabasi, R. Albert, Emergence of scaling in random networks, Science 286 (5439) (1999)
509–512.
[14] V. Colizza, J. Banavar, A. Maritan, A. Rinaldo, Network structures from selection principles,
Phys. Rev. Lett. 92 (19) (2004) 198701.
[15] R. Ferrer, R.V. Sole, Optimization in complex networks, in: R. Pastor-Satorras, M. Rubi,
A. Diaz-Guilera (Eds.), Statistical Mechanics of Complex Networks, Lecture Notes in Physics, Springer,
Berlin, 2003, pp. 114–125.
[16] E. Mayr, Populations, Species and Evolution, Harvard University Press, Harvard, 1970.
[17] R.V. Sole, R. Ferrer, S. Valverde, J.M. Montoya, Selection, tinkering and emergence in complex
networks, Complexity 8 (1) (2000) 20–33.
[18] M. Kanehisa, S. Goto, KEGG: Kyoto Encyclopedia of Genes and Genomes, Nucleic Acids Res. 28 (1)
(2000) 27–30.
[19] P. Erdös, A. Rényi, On random graphs I, Publ. Math. Debrecen 6 (1959) 290–297.
[20] B. Bollobas, Random Graphs, Academic Press, London, 1985.
[21] R. Pastor-Satorras, E. Smith, R.V. Sole, Evolving protein interaction networks through gene
duplication, J. Theoret. Biol. 222 (2) (2003) 199–210.
[22] R. Albert, A.L. Barabasi, Statistical mechanics of complex networks, Rev. Mod. Phys. 74 (2002).