Foundations of Bioinformatics - Kyoto University Bioinformatics Center

Intensive Lecture in Mathematics, Kyushu University
Comparison, Analysis, and Control of
Biological Networks (1)
Scale-free Networks
Tatsuya Akutsu
Bioinformatics Center
Institute for Chemical Research
Kyoto University
Contents of Course








Scale-free Networks
Transformation of Scale-free Networks
Domain-based Mathematical Models for Protein Evolution
Boolean Networks: Attractor Detection and Control
Probabilistic Boolean Networks
Control of Complex Networks (Mathematics Colloquium)
Boolean and Flux Balance Analyses of Metabolic Networks
Comparison of Chemical Graphs
Contents of Lecture (1)





Background
Graphs and Networks
Small World
Scale-free Network
Models of Scale-free Networks



Preferential Attachment
Deterministic Model
Network Motif
Background

Systems Biology


Understanding of cells/organisms as systems
・Inference of networks and interactions
・Computer simulation of cells and organisms
・Stability analysis and control of biological systems
・Experimental verifications
Network Biology



Understanding of cells/organisms as systems
Analysis of structural features
・Small world (1998)
・Scale-free network (1999)
・Network motif (2002)
Analysis of dynamical features
Graphs and Networks

Graph
・Fundamental concept in discrete mathematics and computer science
・Consisting of nodes and edges
  ・node ⇔ object (e.g., chemical compound)
  ・edge ⇔ relation between two objects (e.g., chemical reaction)
・Undirected graph: edge does not have direction
・Directed graph: edge has direction
Network
・Edges with meaning and/or weights
・We do not distinguish graphs from networks in this lecture
[Figure: an undirected graph and a directed graph]
Graphs and Biological Networks
[Figure: metabolic network (KEGG) viewed as a graph of nodes and edges]
Graphs and Real Networks

Metabolic network
・node ⇔ chemical compound, edge ⇔ reaction
Protein-protein interaction (PPI) network
・node ⇔ protein, edge ⇔ interaction
Genetic network
・node ⇔ gene, edge ⇔ gene regulation
WWW
・node ⇔ Web page, edge ⇔ link
Researchers' network
・node ⇔ researcher, edge ⇔ existence of joint paper
Small World
Distance between Nodes

Path
・Sequence of edges connecting two nodes
Length of path
・#edges
Distance between nodes
・Length of the shortest path between two nodes
Examples of paths between A and E
・Path 1: (A,G), (G,B), (B,F), (F,E) ⇒ length=4
・Path 2: (A,G), (G,F), (F,E) ⇒ length=3
・Path 3: (A,B), (B,E) ⇒ length=2
・dist(A,E)=2 (dist(A,I)=3, dist(C,H)=3)
[Figure: example graph with nodes A to I]
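The distance defined above can be computed by breadth-first search. The following is a minimal sketch, not part of the original slides. It assumes a graph containing only the edges named in Paths 1 to 3, since the remaining edges of the figure are not recoverable from the text; the names `EDGES`, `build_adjacency` and `dist` are illustrative.

```python
from collections import deque

# Assumption: only the edges that appear in Paths 1-3 are included.
EDGES = [("A", "G"), ("G", "B"), ("B", "F"), ("F", "E"),
         ("G", "F"), ("A", "B"), ("B", "E")]

def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def dist(adj, source, target):
    """Return the number of edges on a shortest path, or None if unreachable."""
    seen = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return seen[u]
        for v in adj[u]:
            if v not in seen:
                seen[v] = seen[u] + 1
                queue.append(v)
    return None

ADJ = build_adjacency(EDGES)
print(dist(ADJ, "A", "E"))  # 2, via Path 3: (A,B), (B,E)
```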
Cluster Coefficient

Cluster coefficient
・$C_i = \frac{2 m_i}{k_i (k_i - 1)}$
・$m_i$: #edges among neighboring nodes of node $i$
・$k_i$: degree of node $i$
Measure of modularity
・$C_i \approx 1$ ⇔ like clique
・$m_i$ is at most $\frac{k_i (k_i - 1)}{2}$
[Figure: example nodes with $C_i = 1$ and $C_i = 0$]
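As a small illustration of the formula $C_i = 2 m_i / (k_i (k_i - 1))$, the sketch below computes the cluster coefficient of one node from an adjacency map. The two toy graphs (a 4-node clique and a star) are assumptions chosen to reproduce the extreme cases $C_i = 1$ and $C_i = 0$; returning 0 for nodes with degree below 2 is a convention assumed here, not something stated on the slide.

```python
from itertools import combinations

def cluster_coefficient(adj, i):
    """C_i = 2 * m_i / (k_i * (k_i - 1)) for node i of an undirected graph."""
    neighbors = adj[i]
    k = len(neighbors)
    if k < 2:
        return 0.0  # assumed convention: undefined case treated as 0
    # m_i: number of edges among the neighbors of node i
    m = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2.0 * m / (k * (k - 1))

# Toy graphs (illustrative assumptions)
clique = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}}
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}

print(cluster_coefficient(clique, 0))  # 1.0
print(cluster_coefficient(star, 0))    # 0.0
```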
Small World



Graph with short average distance (O(log n) or less) and large average clustering coefficient
It is reported that many real networks have the small-world property
Average distance of WWW ⇒ around 19 (Albert et al., Nature, 1999)
[Figure: example graph with nodes A to I; Ave. dist ≦ 3]
Scale-free Network
Scale-free Network: Definition

Degree of node
・#edges connecting to node
Degree distribution P(k)
・Frequency of nodes with degree k
Scale-free network
・P(k) follows (approximately) a power-law: $P(k) \propto k^{-\gamma}$
[Figure: example nodes with degree=5, degree=2, and degree=3]
Degree Distribution: Example
[Figure: example graph with nodes A to J]
Degree
・Degree 1: J
・Degree 2: B, C, D, F, G, H
・Degree 3: A, E, I
Degree distribution: P(k)
・P(1)=0.1, P(2)=0.6, P(3)=0.3, P(4)=P(5)=P(6)=…=0
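The degree distribution in this example can be recomputed directly from the listed degrees. The sketch below is illustrative only; the dictionary `degrees` simply transcribes the degrees given on the slide, and `degree_distribution` is a hypothetical helper name.

```python
from collections import Counter

# Degrees as listed on the slide (nodes A to J)
degrees = {"A": 3, "B": 2, "C": 2, "D": 2, "E": 3,
           "F": 2, "G": 2, "H": 2, "I": 3, "J": 1}

def degree_distribution(degrees):
    """Return P(k): the fraction of nodes having degree k."""
    counts = Counter(degrees.values())
    n = len(degrees)
    return {k: counts[k] / n for k in sorted(counts)}

P = degree_distribution(degrees)
print(P)  # {1: 0.1, 2: 0.6, 3: 0.3}
```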
Degree Distribution in Scale-free Network
[Figure: plot of #nodes against degree, with #nodes ∝ (degree)^{-3}; example nodes with degree=5, degree=2, and degree=3]
Features of Scale-free Network

Def.: P(k) follows a power law ($P(k) \propto k^{-\gamma}$)

Big difference from random (Erdős-Rényi) graph
(with Poisson distribution: $e^{-\lambda} \lambda^k / k!$)

Existence of hubs (nodes with large degree)
Hubs often play important roles

$k^{-\gamma}$ in real networks






PPI: γ≒2.2 (depending on organisms)
Metabolic: γ≒2.24 (depending on organisms)
Movie stars: γ≒2.3
WWW: γ≒2.1
Power grid: γ≒4 (or, not scale-free ?)
Poisson Distribution vs. Power-law Distribution
[Figure: degree distributions of a random graph (Poisson distribution) and a scale-free graph (power-law); axes P(k) against k and log P(k) against log(k)]
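To make the contrast between the two distributions concrete, the sketch below evaluates both at a few degrees. The parameter choices (λ = 4, γ = 3, truncation of the power law at `K_MAX`) are assumptions for illustration and do not come from the slides. The point is that the Poisson tail decays far faster, so large-degree nodes (hubs) are practically absent in a random graph.

```python
import math

LAMBDA = 4.0    # assumed mean degree of the random graph
GAMMA = 3.0     # assumed power-law exponent
K_MAX = 10_000  # assumed truncation used only for normalization

def poisson_pmf(k, lam=LAMBDA):
    """Poisson probability e^{-lam} lam^k / k!."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Normalizing constant for P(k) proportional to k^{-gamma}, k = 1..K_MAX
_Z = sum(j ** -GAMMA for j in range(1, K_MAX + 1))

def power_law_pmf(k, gamma=GAMMA):
    """Truncated, normalized power law k^{-gamma} / Z."""
    return k ** -gamma / _Z

for k in (1, 5, 10, 50):
    print(k, poisson_pmf(k), power_law_pmf(k))
```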
Analysis of PPI Network (Yeast)

PPI (protein-protein interaction) network follows power-law
・node: protein
・edge: interaction
Nodes with degree ≦ 5 (93%)
・Around 21% are essential (lethal)
Nodes with degree ≧ 16 (0.7%)
・Around 62% are essential
・Referred to as hubs, many of which play important roles
[Jeong et al., Nature 411:41-42, 2001]
Models of Scale-free Networks
Growth and Preferential Attachment Model

Growth and preferential attachment [Barabasi & Albert 1999]
Also referred to as the Rich-get-richer Model
Method (yielding a network with $P(k) \propto k^{-3}$)
・Construct a complete graph with $m_0$ nodes
・Repeat the following
  ・Add a new node $v$ to the current graph
  ・Add edges between $v$ and $m$ nodes in the current graph, where each node is selected with probability proportional to its degree (i.e., $\deg(v_i) / \sum_j \deg(v_j)$)
c.f.: construction of random graph
・Create all N nodes
・Repeat the following
  ・Add an edge between two randomly chosen nodes
・(or, connect two nodes with uniform probability p)
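The construction steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the default values of `m0` and `m`, and the use of rejection sampling to obtain `m` distinct targets (which deviates slightly from exact proportional sampling) are all choices made here.

```python
import random
from collections import Counter

def preferential_attachment_graph(n, m0=3, m=2, seed=0):
    """Grow a graph to n nodes; requires 2 <= m0 and m <= m0."""
    rng = random.Random(seed)
    # Start from a complete graph with m0 nodes
    edges = [(u, v) for u in range(m0) for v in range(u + 1, m0)]
    # 'stubs' lists every edge endpoint, so a uniform draw from it picks
    # node v_i with probability deg(v_i) / sum_j deg(v_j).
    stubs = [x for e in edges for x in e]
    for v in range(m0, n):
        targets = set()
        while len(targets) < m:  # assumption: redraw until m distinct nodes
            targets.add(rng.choice(stubs))
        for u in targets:
            edges.append((u, v))
            stubs.extend((u, v))
    return edges

edges = preferential_attachment_graph(2000)
degree = Counter(x for e in edges for x in e)
print("max degree:", max(degree.values()))
print("nodes with minimum degree:", sum(1 for d in degree.values() if d == 2))
```

Storing edge endpoints in a flat list keeps each degree-proportional draw at constant cost, at the price of memory proportional to the number of edges.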
Random Network vs. Scale-free Network
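[Figure: growth of a random network and of a scale-free network; nodes are labeled with selection probabilities 2/6, 2/6, 2/6 (3 nodes), 3/10, 3/10, 2/10, 2/10 (4 nodes), and 4/14, 4/14, 2/14, 2/14, 2/14 (5 nodes)]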
Analysis by Mean-field Approximation



$k_i(t)$: degree of node $i$ (created at time $t_i$) at time $t$
#edges at time $t$ ≒ $mt$
Prob. that degree of node $i$ increases at time $t$:
$\frac{\partial k_i(t)}{\partial t} = \frac{m k_i(t)}{2mt}$
By solving this diff. eq. with $k_i(t_i) = m$:
$k_i(t) = m \left( \frac{t}{t_i} \right)^{0.5}$
Suppose network is completed at time $t_n$.
From $k_i(t_n) = k$, creation time of node $i$ with degree $k$ at time $t_n$ is given by
$t_i = \frac{m^2 t_n}{k^2}$
Change of $t_i$ according to change of $k$ is estimated by differentiating the above term:
$\frac{d t_i}{d k} = -\frac{2 m^2 t_n}{k^3}$
⇒ Creation time changes by $2 t_n m^2 k^{-3}$ with unit change of $k$
⇒ #nodes with degree $k$ is approximately $2 t_n m^2 k^{-3}$
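As a sanity check on the mean-field solution, the sketch below integrates $\partial k_i / \partial t = k_i / (2t)$ numerically and compares the result with the closed form $m (t / t_i)^{0.5}$. The values of `m`, `t_i`, `t_n`, the step count, and the use of a plain Euler scheme are all assumptions made for illustration.

```python
import math

def integrate_degree(m, t_i, t_n, steps=100_000):
    """Euler integration of dk/dt = m*k / (2*m*t) from t_i to t_n, k(t_i) = m."""
    k, t = float(m), float(t_i)
    dt = (t_n - t_i) / steps
    for _ in range(steps):
        k += dt * (m * k) / (2.0 * m * t)
        t += dt
    return k

m, t_i, t_n = 2, 10.0, 1000.0          # illustrative values
numeric = integrate_degree(m, t_i, t_n)
closed_form = m * math.sqrt(t_n / t_i)  # m * (t_n / t_i)^0.5
print(numeric, closed_form)

# Inverting k = m * (t_n / t_i)^0.5 recovers the creation time t_i
k = closed_form
print(m ** 2 * t_n / k ** 2)
```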
Illustration of Analysis
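[Figure: growth of the network from t=0 to t=3, with nodes i=1 to i=5]
ki(t): degree of node i at time t

         t=0  t=1  t=2  t=3
k1(t)     2    3    4    5
k2(t)     2    3    4    4
k3(t)     2    2    2    3
k4(t)     -    2    2    2
k5(t)     -    -    2    2

Add m edges at time t:
$\frac{\partial k_i(t)}{\partial t} = \frac{m k_i(t)}{2mt}$ (the denominator $2mt$ is the sum of degrees)
[Figure: plot of $k_i(t)$ against $t$ up to $t_n$, marking degrees $m$, $k$ and $k+1$, with $t_i = \frac{m^2 t_n}{k^2}$ and $\Delta t_i \approx -\frac{2 m^2 t_n}{k^3}$]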
Analysis by Master Equation
Let $p(k, t_i, t)$ be the probability that node $i$ has degree $k$ at time $t$.
Then, we have:
$p(k, t_i, t+1) = \frac{k-1}{2t} p(k-1, t_i, t) + \left(1 - \frac{k}{2t}\right) p(k, t_i, t).$
We also have:
$P(k) = \lim_{t \to \infty} \frac{\sum_{t_i} p(k, t_i, t)}{t}.$
Using $p(k, t+1, t) = 0$, we get
$\sum_{t_i=1}^{t+1} p(k, t_i, t+1) = \frac{k-1}{2t} \sum_{t_i=1}^{t} p(k-1, t_i, t) + \left(1 - \frac{k}{2t}\right) \sum_{t_i=1}^{t} p(k, t_i, t).$
Using $P(k) = \sum_{t_i=1}^{t} p(k, t_i, t) / t$ etc., we get
$(t+1) P(k) = \frac{k-1}{2t} t P(k-1) + \left(1 - \frac{k}{2t}\right) t P(k),$
$P(k) = \frac{k-1}{k+2} P(k-1)$ (for $k \ge m+1$).
Therefore, we have: $P(k) = \frac{\mathrm{const}}{k(k+1)(k+2)}.$
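The last step can be checked by iterating the recurrence $P(k) = \frac{k-1}{k+2} P(k-1)$ exactly. The sketch below uses rational arithmetic and an unnormalized starting value $P(m) = 1$; both the starting value and the choice $m = 2$ are assumptions for illustration. If the closed form holds, $k(k+1)(k+2) P(k)$ must be the same constant for every $k$.

```python
from fractions import Fraction

def iterate_recurrence(m, k_max):
    """Unnormalized P(k) for k = m..k_max from P(k) = (k-1)/(k+2) * P(k-1)."""
    p = {m: Fraction(1)}  # assumed unnormalized start value
    for k in range(m + 1, k_max + 1):
        p[k] = Fraction(k - 1, k + 2) * p[k - 1]
    return p

p = iterate_recurrence(m=2, k_max=50)
constants = {k * (k + 1) * (k + 2) * p[k] for k in p}
print(constants)  # a single value: the 'const' in const / (k(k+1)(k+2))
```

For large $k$ the product $k(k+1)(k+2)$ behaves like $k^3$, which matches the exponent 3 obtained by the mean-field approximation.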
Evolution of Biological Networks
Preferential attachment model is reasonable for web graphs, but not for biological networks
⇒ Duplication and divergence model
・Gene duplication (copy of node) + mutation (loss of edge)
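The slide describes the duplication and divergence model only in one line, so the following is a rough sketch of one possible reading rather than the model as defined in the literature. The assumptions are: the seed graph is a single edge, the node to duplicate is chosen uniformly, and each inherited edge is kept independently with probability `retain_prob` (a hypothetical parameter).

```python
import random

def duplication_divergence(n, retain_prob=0.5, seed=0):
    """Grow an undirected graph to n nodes by node copying and edge loss."""
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}  # assumed seed graph: one edge
    while len(adj) < n:
        original = rng.choice(sorted(adj))  # gene duplication: pick a node
        copy = len(adj)
        # Mutation: each edge inherited from the original is lost
        # with probability 1 - retain_prob.
        kept = {u for u in adj[original] if rng.random() < retain_prob}
        adj[copy] = kept
        for u in kept:
            adj[u].add(copy)
    return adj

adj = duplication_divergence(500)
print("max degree:", max(len(nb) for nb in adj.values()))
print("isolated nodes:", sum(1 for nb in adj.values() if not nb))
```

In this reading a copy may lose all of its edges and remain isolated; variants that discard such nodes or add an edge between original and copy are equally plausible.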
Deterministic Scale-free Networks
Hierarchical Scale-Free Network

Hierarchical Scale-Free Network [Ravasz, Barabasi et al. 2002]
・Also called: Deterministic Scale-Free Network
・Recursive construction
・Like fractal
・For L-gons, $P(k) \propto k^{-1 - \ln(L+1)/\ln(L)}$
Analysis of Construction of Hierarchical Network
Degree of hub of level $i$ is
$2 + 2^2 + 2^3 + \cdots + 2^i = 2^{i+1} - 2 \sim 2^i$
Number of hubs of level $i$ at step $n$ is proportional to $3^{n-i}$
By letting $k = 2^i$, we have
$3^{n-i} = 3^n / 3^i = 3^n k^{-\ln 3 / \ln 2}$
Thus, we have $\gamma = \frac{\ln 3}{\ln 2}$
(Precisely, $\gamma = 1 + \frac{\ln 3}{\ln 2}$ due to binning adjustment)
[Figure: construction steps n=1 to n=4, with hubs of level i=1 and i=2 marked]
L+M Model

Extension of Hierarchical model
Able to construct networks with arbitrary γ (>2)
(vs. γ<2.58 for Hierarchical model)
$\gamma = 1 + \frac{\ln(L+M)}{\ln(L)}$
[Figure: example network with L=2, M=2]
Nacher et al., Physical Review E, 2005
Analysis of L+M Model
Degree of $i$-th level node is around $L^i$
The number of $i$-th level nodes in $n$-th step is around $(L+M)^{n-i}$
By letting $k = L^i$,
$(L+M)^{n-i} = (L+M)^n / (L+M)^i = (L+M)^n k^{-\ln(L+M)/\ln(L)}$
Thus, we have $\gamma = \frac{\ln(L+M)}{\ln(L)}$
(Finally, by binning, $\gamma = 1 + \frac{\ln(L+M)}{\ln(L)}$)
Relation between Two Deterministic Models

Hierarchical model corresponds to the case of M=1 in L+M model
$\gamma = 1 + \frac{\ln(L+M)}{\ln(L)} = 1 + \frac{\ln(L+1)}{\ln(L)} \le 1 + \frac{\ln(3)}{\ln(2)} \approx 2.58$
[Figure: example network with L=3, M=1]
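The exponent formula $\gamma = 1 + \ln(L+M)/\ln(L)$ is easy to evaluate numerically. The short sketch below (the function name `gamma_lm` is illustrative) shows the hierarchical case M=1 and how larger M pushes γ upward.

```python
import math

def gamma_lm(L, M):
    """Exponent of the L+M model: 1 + ln(L + M) / ln(L), for L >= 2, M >= 1."""
    return 1.0 + math.log(L + M) / math.log(L)

print(round(gamma_lm(2, 1), 2))  # 2.58 (hierarchical model, L=2)
for L, M in [(3, 1), (2, 2), (2, 6)]:
    print(L, M, round(gamma_lm(L, M), 3))
```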
Network Motif

Sequence Motif
・Pattern appearing in sequences with common feature
・E.g., L-x(6)-L-x(6)-L-x(6)-L (Leucine Zipper Motif)
Network Motif
・Frequently appearing network pattern in given network(s), compared with randomized networks
・Network patterns are usually given by subgraphs
・Randomized networks are constructed via random exchanges of edge pairs
・Examples
  ・Feed-forward Loop
  ・Single Input Module
  ・Dense Overlapping regulons
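The two ingredients of motif detection described above, counting a pattern and randomizing the network by exchanging edge pairs, can be sketched as follows. This is an illustrative sketch only: the toy edge list, the function names, and the rule of skipping swaps that would create self-loops or duplicate edges are assumptions, and the brute-force triple enumeration is suitable only for small networks.

```python
import random
from itertools import permutations

def count_feed_forward_loops(edges):
    """Count ordered triples (a, b, c) with edges a->b, b->c and a->c."""
    edge_set = set(edges)
    nodes = {x for e in edge_set for x in e}
    return sum(1 for a, b, c in permutations(nodes, 3)
               if (a, b) in edge_set and (b, c) in edge_set
               and (a, c) in edge_set)

def randomize_by_edge_swaps(edges, n_swaps, seed=0):
    """Exchange targets of random edge pairs; in/out degrees are preserved."""
    rng = random.Random(seed)
    edge_list = list(dict.fromkeys(edges))
    edge_set = set(edge_list)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        # a->b, c->d would become a->d, c->b
        if a == d or c == b or (a, d) in edge_set or (c, b) in edge_set:
            continue  # skip swaps creating self-loops or duplicate edges
        edge_set -= {(a, b), (c, d)}
        edge_set |= {(a, d), (c, b)}
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return edge_list

# Toy directed network (illustrative assumption)
toy = [("X", "Y"), ("Y", "Z"), ("X", "Z"), ("Z", "W"), ("W", "V"), ("U", "X")]
print(count_feed_forward_loops(toy))  # 1: X->Y, Y->Z, X->Z
shuffled = randomize_by_edge_swaps(toy, n_swaps=100)
print(count_feed_forward_loops(shuffled))
```

In practice the count in the real network is compared with the counts over many randomized networks; a pattern is reported as a motif when it appears significantly more often in the real network.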
Example of Sequence Motifs
• Zinc finger motif
C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H
• Leucine zipper motif
L-x(6)-L-x(6)-L-x(6)-L
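Patterns written in this notation can be searched for with regular expressions. The sketch below converts the subset of the notation used on this slide (single residues, `x` for any residue, bracketed alternatives, and repeat counts in parentheses) into a Python regex. The converter name and the test sequences are hypothetical; the sequences are synthetic strings built to match, not real proteins.

```python
import re

_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+)(?:,(\d+))?\))?")

def pattern_to_regex(pattern):
    """Convert e.g. 'L-x(6)-L' into the regex string 'L.{6}L'."""
    parts = []
    for element in pattern.split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported element: {element!r}")
        symbol, low, high = match.groups()
        regex = "." if symbol == "x" else symbol  # x means any residue
        if low is not None:
            regex += "{" + low + ("," + high if high else "") + "}"
        parts.append(regex)
    return "".join(parts)

leucine_zipper = pattern_to_regex("L-x(6)-L-x(6)-L-x(6)-L")
zinc_finger = pattern_to_regex("C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H")
print(leucine_zipper)  # L.{6}L.{6}L.{6}L
print(zinc_finger)     # C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H

# Synthetic sequence, constructed only to contain the leucine zipper pattern
synthetic = "MK" + "L" + "A" * 6 + "L" + "A" * 6 + "L" + "A" * 6 + "L" + "GG"
print(bool(re.search(leucine_zipper, synthetic)))  # True
```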
Example of Network Motif (1)
[Figure: a network and a motif occurring in it]
Summary

Graph/Network
・Defined by nodes and edges
Small World
・Short average distance
Scale-free Network
・Degree distribution follows a power-law
Models of scale-free networks
・Growth and preferential attachment model
・Deterministic model
Network Motif
・Frequently appearing small subgraphs