CAP5510 - Bioinformatics

CIS 4930/6930 – Recent Advances in Bioinformatics
Spring 2014
Network models
Tamer Kahveci
Graphs
• Useful for describing networks.
• G = (V, E) with
– V = set of nodes
– E = set of edges
• Topological models
– Directed/Undirected
– Weighted/Unweighted
– Deterministic/Probabilistic (G = (V, E, P))
• Concepts
– Degree (indegree/outdegree), path
Topological properties
• Degree distribution, P(k), of G = (V, E)
– Deg(k) = number of nodes in G with degree k.
– P(k) = Deg(k)/|V| = probability that a random node in G has degree k.
[Figure: example graph with node labels 2, 3, 2, 1; H. pylori network. Todor et al. TCBB. 10:4. 2013]
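The definition of P(k) can be sketched in a few lines of Python. This is a minimal sketch assuming an undirected graph given as a node list and an edge list; the function name `degree_distribution` and the 4-node example graph are illustrative and not taken from the slides.

```python
from collections import Counter

def degree_distribution(nodes, edges):
    """Return {k: P(k)} for an undirected graph given as node and edge lists."""
    degree = {v: 0 for v in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Deg(k): number of nodes whose degree equals k
    deg_count = Counter(degree.values())
    # P(k) = Deg(k) / |V|
    return {k: count / len(nodes) for k, count in deg_count.items()}

# Hypothetical example graph: degrees are a=2, b=2, c=3, d=1.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
distribution = degree_distribution(nodes, edges)
print(distribution)
```

For this example, half of the nodes have degree 2, so P(2) = 0.5, and P(1) = P(3) = 0.25.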
Topological properties
• Neighbors of node v, N(v) = set of nodes
adjacent to v.
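• Clustering coefficient of node v, Cv, measures how well the nodes in N(v) are connected to each other:
Cv = (# edges among N(v)) / (max # edges possible among N(v))
– Example (from the figure): Cv = 2/6
• The denominator differs slightly for directed vs undirected graphs: |N(v)|(|N(v)|-1) for directed, |N(v)|(|N(v)|-1)/2 for undirected.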
• C(k) = average clustering coefficient of all nodes with degree k.
• Network's clustering coefficient = average clustering coefficient of all nodes in G = (∑ Cv) / |V|
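The two definitions above can be sketched as follows. This is a minimal sketch for undirected graphs only, assuming the graph is stored as a dict mapping each node to the set of its neighbors. The function names, the example graph, and the convention Cv = 0 for nodes with fewer than two neighbors are assumptions, not part of the slides.

```python
from itertools import combinations

def clustering_coefficient(adj, v):
    """Cv for an undirected graph; adj maps each node to the set of its neighbors."""
    neighbors = adj[v]
    k = len(neighbors)
    if k < 2:
        return 0.0  # assumed convention: no pair of neighbors exists, so Cv = 0
    # Count the edges that connect two neighbors of v.
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    # Max # edges among k nodes in an undirected graph is k(k-1)/2.
    return links / (k * (k - 1) / 2)

def average_clustering(adj):
    """Network clustering coefficient: (sum of Cv) / |V|."""
    return sum(clustering_coefficient(adj, v) for v in adj) / len(adj)

# Hypothetical example: v has 4 neighbors with 2 edges among them (a-b and b-c).
adj = {
    "v": {"a", "b", "c", "d"},
    "a": {"v", "b"},
    "b": {"v", "a", "c"},
    "c": {"v", "b"},
    "d": {"v"},
}
print(clustering_coefficient(adj, "v"))
print(average_clustering(adj))
```

In this example node v has 2 edges among its 4 neighbors out of 6 possible, so Cv = 2/6.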
Centrality of a node
• Centrality of a node v in graph G = (V, E) indicates the relative importance of v in G with respect to the rest of the nodes in G. Let us denote it with f(v | G) or simply f(v).
• Many centrality measures exist
– Degree centrality
• How popular am I?
• fDeg(v) = Deg(v)
– Closeness centrality
– Betweenness centrality
Closeness Centrality
• How close am I to everyone else?
• Given G = (V, E)
• Dist(u,v) = shortest path length from
u to v in G
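• fClose(u) = ∑v in V Dist(u, v) (a smaller sum means u is closer to everyone else; closeness is often reported as the reciprocal of this sum)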
[Figure: example graph; node labels 1, 1, 2, 3]
• Alternative (for disconnected networks)
– fClose(u) = ∑v in V-{u} 1/Dist(u, v) (here a larger value means closer)
– 1/∞ = 0 when v is unreachable from u
• How do I find shortest paths?
– Floyd-Warshall algorithm
– Johnson’s algorithm
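Both closeness variants can be sketched in Python. This sketch assumes an unweighted, undirected graph, so it uses breadth-first search for Dist(u, v) rather than the Floyd-Warshall or Johnson's algorithms listed above (those are needed for weighted graphs). The function names and the example graph are illustrative.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def closeness_sum(adj, u):
    """Sum of Dist(u, v) over all v; infinite if some node is unreachable."""
    dist = bfs_distances(adj, u)
    if len(dist) < len(adj):
        return float("inf")
    return sum(dist.values())

def closeness_harmonic(adj, u):
    """Sum of 1/Dist(u, v) over v != u; unreachable nodes contribute 0."""
    dist = bfs_distances(adj, u)
    return sum(1 / d for v, d in dist.items() if v != u)

# Hypothetical example: path a-b-c-d plus an isolated node e.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}, "e": set()}
print(closeness_sum(adj, "a"))       # infinite, because e is unreachable
print(closeness_harmonic(adj, "a"))  # 1/1 + 1/2 + 1/3
```

The example shows why the alternative definition is useful: the plain sum is infinite for every node once the network is disconnected, while the harmonic form still ranks nodes.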
Betweenness Centrality
• How many pairs of nodes use me on the
cheapest route to communicate?
• gst = number of shortest paths between s and t.
• gst(v) = number of shortest paths between s and t that contain v.
• fBetween(v) = (∑s,t gst(v) / gst) / (number of s, t pairs in V-{v}), where the sum runs over all pairs s, t in V-{v}.
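The formula above can be sketched directly. This is a minimal sketch assuming an unweighted, undirected graph (so each unordered pair s, t is counted once) and skipping pairs with no path between them. It relies on the fact that v lies on a shortest s-t path exactly when Dist(s, v) + Dist(v, t) = Dist(s, t), in which case gst(v) = gsv × gvt. All names are illustrative, and this is a direct evaluation, not an optimized algorithm.

```python
from collections import deque
from itertools import combinations

def bfs_counts(adj, source):
    """Return (dist, sigma): BFS distances and # of shortest paths from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]  # every shortest path to u extends to w
    return dist, sigma

def betweenness(adj, v):
    """Average over pairs s, t in V-{v} of gst(v) / gst."""
    info = {s: bfs_counts(adj, s) for s in adj}
    dist_v, sigma_v = info[v]
    pairs = list(combinations([x for x in adj if x != v], 2))
    total = 0.0
    for s, t in pairs:
        dist_s, sigma_s = info[s]
        if t not in dist_s or v not in dist_s:
            continue  # no s-t path, or v unreachable from s
        if dist_s[v] + dist_v[t] == dist_s[t]:
            total += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return total / len(pairs) if pairs else 0.0

# Hypothetical examples: a star centered at c, and a 4-cycle a-b-c-d-a.
star = {"c": {"a", "b", "d"}, "a": {"c"}, "b": {"c"}, "d": {"c"}}
cycle = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "a"}}
print(betweenness(star, "c"), betweenness(star, "a"))
print(betweenness(cycle, "b"))
```

In the star every pair of leaves communicates through c, so fBetween(c) = 1. In the 4-cycle, b carries one of the two shortest a-c paths and nothing else, giving (1/2)/3.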
Floyd-Warshall: shortest path
[Figure: path from i to j through node k+1, with the other intermediate nodes drawn from V’ = {1, 2, …, k}]
Given G = (V, E, w)
Distance(i, j, 0) = w(i, j)
Distance(i, j, k+1) = min{Distance(i, j, k), Distance(i, k+1, k) + Distance(k+1, j, k)}
// d[i,j] is initialized to w(i, j)
for k = 1 to n do                      // allow node k as an intermediate node
  for i = 1 to n do                    // origin i
    for j = 1 to n do                  // destination j
      if (d[i,k] + d[k,j] < d[i,j]) {
        d[i,j] = d[i,k] + d[k,j]       // shorter path length
        visit[i,j] = k                 // new path goes through k
      }
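The pseudocode above translates almost line by line into Python. This is a minimal sketch assuming nodes are numbered 0..n-1, edge weights are given as a dict keyed by (i, j), and there are no negative cycles. The `via` matrix plays the role of `visit`, and the `path` helper is an illustrative addition that is only valid when j is reachable from i.

```python
import math

def floyd_warshall(n, weights):
    """weights: {(i, j): w} for directed edges over nodes 0..n-1. Returns (d, via)."""
    d = [[0 if i == j else weights.get((i, j), math.inf) for j in range(n)]
         for i in range(n)]
    via = [[None] * n for _ in range(n)]
    for k in range(n):              # allow node k as an intermediate node
        for i in range(n):          # origin i
            for j in range(n):      # destination j
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]   # shorter path length
                    via[i][j] = k                 # new path goes through k
    return d, via

def path(via, i, j):
    """Node sequence of a shortest path from i to j (assumes j is reachable)."""
    k = via[i][j]
    if k is None:
        return [i, j]               # direct edge, no intermediate node recorded
    return path(via, i, k)[:-1] + path(via, k, j)

# Hypothetical weighted directed graph.
weights = {(0, 1): 4, (0, 2): 1, (2, 1): 2, (1, 3): 5}
d, via = floyd_warshall(4, weights)
print(d[0][1], d[0][3])
print(path(via, 0, 3))
```

Here the direct edge 0→1 costs 4, but going through node 2 costs 3, so the shortest path from 0 to 3 is 0, 2, 1, 3 with length 8. The three nested loops give O(n³) running time.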
Key network models
• Erdos-Renyi
• Small world
• Scale free
Erdos-Renyi
• Edges are distributed uniformly at random, independently of each other
• Construction
– Given two parameters (n = # of nodes, p = probability that an edge exists)
– For each pair of nodes (u, v)
• Create an edge (u, v) with probability p.
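The construction above can be sketched as follows. This is a minimal sketch assuming an undirected graph without self-loops over nodes 0..n-1; the function name `erdos_renyi` and the optional `seed` parameter are illustrative additions.

```python
import random
from itertools import combinations

def erdos_renyi(n, p, seed=None):
    """Return the edge list of a random graph: each pair gets an edge with prob. p."""
    rng = random.Random(seed)
    # rng.random() is uniform in [0, 1), so the test succeeds with probability p.
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]

edges = erdos_renyi(n=10, p=0.3, seed=1)
print(len(edges), "edges out of", 10 * 9 // 2, "possible pairs")
```

On average the graph has p · n(n-1)/2 edges, and each node has expected degree p(n-1).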
Small World (Watts-Strogatz)
• Everyone tends to be close to each other.
• As the number of nodes (N) in the network
grows, the distance between two random
nodes grows with the logarithm of N.
• Construction
– Given three parameters:
• N = # of nodes.
• K = average degree
• p = rewiring probability
– Construct a ring lattice
• Connect each node i to nodes {i-1, i-2, …, i-K/2} and {i+1, i+2, …, i+K/2} (indices taken mod N) with an edge
…
– For each node u
• For each edge (u, v)
– Randomly pick a node v’ ∈ V-{u}
– Replace (u, v) with (u, v’) with probability p
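The ring lattice and rewiring steps can be sketched as follows. This is a minimal sketch under several assumptions not stated in the slides: K is even and smaller than N, each lattice edge is considered for rewiring once (from its lower-offset endpoint u), and the new endpoint v’ is restricted to nodes not already adjacent to u so that no duplicate edges appear. All names are illustrative.

```python
import random

def watts_strogatz(n, k, p, seed=None):
    """Return a set of undirected edges (frozensets); assumes k even and k < n."""
    rng = random.Random(seed)
    edges = set()
    # Ring lattice: connect node i to its k/2 nearest neighbors on each side.
    for i in range(n):
        for offset in range(1, k // 2 + 1):
            edges.add(frozenset((i, (i + offset) % n)))
    # Rewiring: replace (u, v) with (u, v') with probability p.
    for u in range(n):
        for offset in range(1, k // 2 + 1):
            if rng.random() >= p:
                continue  # keep this lattice edge
            v = (u + offset) % n
            # Assumption: exclude u itself and nodes already adjacent to u.
            candidates = [w for w in range(n)
                          if w != u and frozenset((u, w)) not in edges]
            if candidates:
                edges.remove(frozenset((u, v)))
                edges.add(frozenset((u, rng.choice(candidates))))
    return edges

lattice = watts_strogatz(n=10, k=4, p=0.0)
rewired = watts_strogatz(n=10, k=4, p=1.0, seed=7)
print(len(lattice), len(rewired))
```

Rewiring never changes the number of edges, so both graphs have N·K/2 = 20 edges. With p = 0 the result is the regular ring lattice, and larger p makes the network progressively more random.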
Scale-Free
• A lot of poor (low-degree nodes) work for a few super rich (hubs)
• Probability that a node has degree k drops as a power of k (a power law, not an exponential).
– P(k) ~ k^(-γ)
• Construction (preferential attachment, or "the rich get richer")
– Given two parameters (n = # of nodes, k = average degree)
– Build a small network (e.g. two nodes and one edge)
– Repeat
• Insert a new node v
• Insert k edges from v to existing nodes. Existing node u gets an edge with
probability pu = Deg(u)/ ∑i Deg(i)
– Until we have n nodes
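The preferential attachment loop can be sketched as follows. This is a minimal sketch assuming the seed network is two nodes joined by one edge, and that the k edges of a new node go to distinct existing nodes (capped at the number of existing nodes). Both the distinctness rule and the function name are assumptions.

```python
import random

def preferential_attachment(n, k, seed=None):
    """Grow a graph to n nodes; each new node attaches to k degree-biased targets."""
    rng = random.Random(seed)
    edges = [(0, 1)]            # small seed network: two nodes and one edge
    degree = {0: 1, 1: 1}
    for v in range(2, n):
        existing = list(degree)
        m = min(k, len(existing))
        targets = set()
        while len(targets) < m:
            # Node u is drawn with probability Deg(u) / sum of all degrees.
            u = rng.choices(existing, weights=[degree[x] for x in existing])[0]
            targets.add(u)      # assumption: repeated draws are discarded
        degree[v] = 0
        for u in targets:
            edges.append((v, u))
            degree[u] += 1
            degree[v] += 1
    return edges, degree

edges, degree = preferential_attachment(n=50, k=2, seed=3)
print(len(edges), max(degree.values()))
```

Because every edge adds to the degree of two nodes, adding k edges per new node gives a mean degree that approaches roughly 2k as the network grows.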
Hierarchical
• Similar to fractals
• Scale-free networks with high
clustering.
• Construction
– Create an initial network (seed) with t
peripheral nodes
– Create t copies of this network and
connect each of them to the central
node.
Probabilistic
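G = (V, E, P)
P: E -> (0, 1]
[Figure: probabilistic graph on nodes a, b, c with two edges of probability 0.3 and 0.6, and its four possible deterministic instances]
• Neither edge exists: (1-0.6) x (1-0.3) = 0.28
• Only the 0.3 edge exists: 0.3 x (1-0.6) = 0.12
• Only the 0.6 edge exists: (1-0.3) x 0.6 = 0.42
• Both edges exist: 0.3 x 0.6 = 0.18
• 0.28 + 0.12 + 0.42 + 0.18 = 1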
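The arithmetic above can be reproduced by enumerating every deterministic instance of a probabilistic graph. This is a minimal sketch assuming edges exist independently of each other. The edge endpoints in the example are hypothetical; only the probabilities 0.3 and 0.6 come from the slide.

```python
from itertools import product

def possible_worlds(edge_probs):
    """Yield (present_edges, probability) for each deterministic instance."""
    edges = list(edge_probs)
    for present in product([False, True], repeat=len(edges)):
        prob = 1.0
        for e, is_present in zip(edges, present):
            # An edge contributes P(e) if present and 1 - P(e) if absent.
            prob *= edge_probs[e] if is_present else 1 - edge_probs[e]
        yield [e for e, is_present in zip(edges, present) if is_present], prob

# Hypothetical endpoints; probabilities taken from the slide.
edge_probs = {("a", "b"): 0.3, ("a", "c"): 0.6}
worlds = list(possible_worlds(edge_probs))
for present_edges, prob in worlds:
    print(present_edges, round(prob, 2))
print(round(sum(prob for _, prob in worlds), 2))
```

A graph with |E| probabilistic edges has 2^|E| instances, so exhaustive enumeration like this is only practical for very small networks.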