Building Blocks of Biological Networks

Papers:
Finding Significant Network Motifs I:
Milo, Shen-Orr, Itzkovitz, Kashtan, Chklovskii, Alon, Science 298, 824 (2002)
Finding Significant Network Motifs II:
Ziv, Koytcheff, Wiggins, cond-mat/0306610 (2003)
Classifying Networks using Motifs:
Middendorf, Ziv, Adams, Hom, Koytcheff, Levovitz, Woods, Chen, Wiggins,
q-bio/0402017 (2004)
What are networks?
● collection of objects = nodes
● connected by interactions = edges
Networks are Everywhere:
(from M. Newman website)
Global Properties of Networks:
Small World Networks:
Stanley Milgram, a sociologist, sent letters to random people in Nebraska and asked them
to forward the letters through acquaintances toward a target person in Boston
Many letters reached the target in only a few steps – “six degrees of separation”
Short paths between any two nodes
Path scaling: D ~ log(N)
Can you make the connection?
Types of Networks I:
● Social networks and collaboration networks are small worlds
Erdos-Renyi Graphs:
Random graph with k connections per node (or average <k> per node)
has <l> ~ log(N) but is not well clustered
Watts-Strogatz Model:
locally clustered, with random long-range connections
small-world behaviour (figure from M. Newman)
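A minimal sketch (Python with networkx; parameters are illustrative, not from the slides) comparing path length and clustering in the two models:

import networkx as nx

N, k = 1000, 6                                        # illustrative size and mean degree
er = nx.erdos_renyi_graph(N, k / (N - 1), seed=0)     # random graph with <k> ~ 6
ws = nx.watts_strogatz_graph(N, k, p=0.1, seed=0)     # ring lattice, 10% of edges rewired

for name, g in [("Erdos-Renyi", er), ("Watts-Strogatz", ws)]:
    giant = g.subgraph(max(nx.connected_components(g), key=len))   # paths need a connected piece
    print(name,
          " <l> =", round(nx.average_shortest_path_length(giant), 2),
          " clustering =", round(nx.average_clustering(g), 3))
# Both have short paths (~ log N); only Watts-Strogatz also keeps high clustering.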
Types of Networks II:
Scale-free Networks:
have highly heterogeneous connectivities – hubs and outliers
probability of a node having connectivity k: P(k) ~ k^(-γ)
Examples: world-wide web, internet, metabolic networks, protein networks
(figure: hub and outlier nodes; from Jeong, Tombor, Albert, Oltvai & Barabasi, Nature 407, 651 (2000))
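A rough sketch of the P(k) ~ k^(-γ) claim (Python; a preferential-attachment graph stands in for real network data, and the crude log-log fit is illustrative – maximum-likelihood estimators are preferred in practice):

import numpy as np
import networkx as nx

g = nx.barabasi_albert_graph(5000, m=3, seed=0)      # stand-in for a scale-free network
degrees = np.array([d for _, d in g.degree()])

ks, counts = np.unique(degrees, return_counts=True)
pk = counts / counts.sum()                           # empirical degree distribution P(k)

slope, _ = np.polyfit(np.log(ks), np.log(pk), 1)     # fit log P(k) = -gamma*log k + const
print("estimated gamma ~", round(-slope, 2))         # the BA model predicts gamma ~ 3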
Global Properties of Biological Networks:
Metabolic Networks:
scale-free but apparently NOT small-world (the measured diameter is independent of N)
(figure from Jeong et al: archaea, eukaryotes, prokaryotes)
Finding the Building Blocks of Networks:
● Are there small subgraphs that are used repeatedly in the larger network?
● Why are they there? What do they do?
(from Milo et al, Science 298, 824 (2002))
● count all the occurrences of all subgraphs up to size n = 4
● assess significance with respect to a randomized version of the network
(example subgraph: the Feed-Forward Loop)
● significance is in some sense determined by the random null model !?!
Scoring Motifs:
● There are 13 connected directed subgraphs for n = 3, and 199 for n = 4
● # of occurrences of a subgraph in the 'real' network = N_real
Randomization:
● The graph is randomized while preserving the # of incoming and outgoing edges for each node
● The counts of (n−1)-node subgraphs are also preserved in the randomization
● # of occurrences of the subgraph in the randomized networks = N_rand ± σ
Scoring:
● Score given by the Z-score: Z = (N_real − N_rand)/σ (see the sketch below)
Problem:
● Counting the # of subgraphs for n > 4 is REALLY hard
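A minimal sketch of the scoring step (Python with networkx/numpy; FFLs are counted by brute force, the randomization only approximately preserves degrees, and the paper's extra (n−1)-subgraph constraint is omitted; a toy random graph just demonstrates the mechanics):

import networkx as nx
import numpy as np
from itertools import permutations

def count_ffl(g):
    """Brute-force count of feed-forward loops X->Y, X->Z, Y->Z (slow but clear)."""
    return sum(1 for x, y, z in permutations(g.nodes, 3)
               if g.has_edge(x, y) and g.has_edge(x, z) and g.has_edge(y, z))

def randomized(g, seed):
    """Random graph with (approximately) the same in- and out-degree sequences."""
    din = [d for _, d in g.in_degree()]
    dout = [d for _, d in g.out_degree()]
    r = nx.DiGraph(nx.directed_configuration_model(din, dout, seed=seed))  # collapse multi-edges
    r.remove_edges_from(nx.selfloop_edges(r))
    return r

def z_score(g, n_rand=20):
    n_real = count_ffl(g)
    rand = [count_ffl(randomized(g, seed=i)) for i in range(n_rand)]
    return (n_real - np.mean(rand)) / np.std(rand)

g = nx.gnp_random_graph(50, 0.08, directed=True, seed=0)   # toy directed network
print("FFL Z-score:", round(z_score(g), 2))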
Significant Building Blocks:
Subgraph Function: Feed-Forward Loop
(diagram: X → Y, X → Z, Y → Z; Z is ON when X AND Y are ON)
● inputs can be either activators or repressors
● 8 possibilities for the FFL
● consider the coherent FFL with AND logic
● circuit acts as a persistence detector (see the sketch below)
● X must persist for some time in order to turn ON Z (a time delay)
● circuit shuts off very rapidly
● good way to insulate from noise
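A toy simulation (Python/numpy; a caricature, not a model from the papers) of the coherent AND-gate FFL acting as a persistence detector:

import numpy as np

def simulate_ffl(x_signal, tau=5.0, threshold=0.5, dt=1.0):
    """X drives Y, which builds up slowly; Z turns ON only while X AND Y are ON."""
    y, z = 0.0, []
    for x in x_signal:
        y += dt / tau * (x - y)                       # Y relaxes toward X with timescale tau
        z.append(1 if (x > 0.5 and y > threshold) else 0)
    return np.array(z)

short_pulse = np.r_[np.zeros(10), np.ones(3),  np.zeros(20)]   # brief X input
long_pulse  = np.r_[np.zeros(10), np.ones(15), np.zeros(20)]   # persistent X input

print("Z ON-steps, short X pulse:", simulate_ffl(short_pulse).sum())   # 0: brief pulse filtered out
print("Z ON-steps, long X pulse: ", simulate_ffl(long_pulse).sum())    # turns ON after a delay
# Z switches on only after X has persisted for a few time steps, and switches
# off immediately when X disappears (the AND gate requires X directly).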
Going a bit beyond:
Some linear algebra:
Adjacency matrix A: A_ij = 1 if there is an edge from node i to node j (0 otherwise)
Square: A·A takes two steps 'forward' in the graph; a non-zero element (i, j) means nodes i and j are connected by a two-step path
Transpose: T(A) reverses all the directions of the edges, taking a step 'back'
Diagonal: D(A) returns just the diagonal elements of the matrix, identifying loops (closed walks)
Complement: U = (1 – D)(A) returns all off-diagonal elements of the matrix, identifying open walks
By doing matrix operations we can explore the paths in our network
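A small numpy sketch of these operations on a toy 3-node graph (values are illustrative only):

import numpy as np

# Toy graph with edges 0 -> 1, 0 -> 2, 1 -> 2 (a feed-forward loop);
# convention: A[i, j] = 1 if there is an edge i -> j.
A = np.array([[0, 1, 1],
              [0, 0, 1],
              [0, 0, 0]])

two_steps = A @ A                       # non-zero (i, j): i reaches j in two steps
T = A.T                                 # transpose: every edge reversed (a step 'back')
D = np.diag(np.diag(A))                 # diagonal part only (closed walks / loops)
U = A - np.diag(np.diag(A))             # off-diagonal part only (open walks)

print(two_steps)                        # shows that node 0 reaches node 2 in two steps (0 -> 1 -> 2)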
The Language of Networks:
(from Ziv et al, cond-mat/0306610)
● use A, T, D and U as an alphabet for building up graph “words”
  e.g. the number of FFLs is given by Σ(D(T(A)AA)) (see the sketch below)
NOTE: a word is an abstraction of a subgraph – many degeneracies
Method:
compute the significance of each 'word' with respect to a null random network
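A sketch of evaluating the FFL word Σ(D(T(A)AA)) with numpy (the sum of the diagonal of T(A)·A·A is exactly its trace; toy graph as above):

import numpy as np

def word_ffl(A):
    """Sum of the diagonal of T(A)·A·A, i.e. the number of X->Y, X->Z, Y->Z triples."""
    return int(np.trace(A.T @ A @ A))

# Toy graph: 0 -> 1, 0 -> 2, 1 -> 2 is a single feed-forward loop.
A = np.array([[0, 1, 1],
              [0, 0, 1],
              [0, 0, 0]])
print(word_ffl(A))   # -> 1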
Word Finds:
(results figure: significant words found in the E. coli and yeast networks, including the FFL)
Classifying Networks:
(from Middendorf et al., q-bio/0402017)
● The counts for each word in a network give it a fingerprint, e.g.:

  Word     Count
  AATA     5
  DUTA     3
  DAAUTA   0
  UAAAA    10
  DUAA     1

● A network resides in a high-dimensional “word-space”
(figure: example networks plotted in word-space – yeast protein, E. coli transcription, Erdos, WWW)
The Classifier: SVM
(figure: dividing plane between two classes)

  Class   Votes
  Cyan    0/2
  Blue    2/2
  Red     1/2
Support Vector Machine = find the maximum-margin hyperplane that divides two classes of data
Multi-class SVM: cast votes using all pair-wise classifications (see the sketch below)
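A minimal sketch of the classification step (scikit-learn assumed; the word-count fingerprints and labels below are made up purely for illustration):

import numpy as np
from sklearn.svm import SVC

# Rows = networks, columns = word counts (the "fingerprint" vectors).
X_train = np.array([[5, 3, 0, 10, 1],
                    [4, 2, 1,  9, 0],
                    [0, 8, 7,  1, 3],
                    [1, 9, 6,  0, 2]])
y_train = ["duplication-mutation", "duplication-mutation",
           "erdos-renyi", "erdos-renyi"]

# scikit-learn's SVC handles multi-class problems with this one-vs-one voting
# scheme internally; a linear kernel corresponds to a dividing hyperplane.
clf = SVC(kernel="linear").fit(X_train, y_train)
print(clf.predict([[5, 2, 1, 8, 1]]))   # the class winning the most pairwise votes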
The Classes:
A) static scale-free
B) small-world (Watts-Strogatz)
C) Erdos-Renyi
D) preferential attachment
E) duplication and mutation
Preferential Attachment: a new node attaches preferentially to the most highly connected nodes
Duplication-Mutation: duplicate a node and its connections, then add/subtract random connections
(see the generator sketch below)
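A sketch of generating the five classes with networkx (parameters are illustrative; nx.scale_free_graph and nx.duplication_divergence_graph are stand-ins for the exact static scale-free and duplication-mutation models used in the paper):

import networkx as nx

N = 500   # illustrative size
models = {
    "static scale-free":       nx.scale_free_graph(N, seed=0),
    "small-world (WS)":        nx.watts_strogatz_graph(N, k=6, p=0.1, seed=0),
    "Erdos-Renyi":             nx.erdos_renyi_graph(N, p=0.01, seed=0),
    "preferential attachment": nx.barabasi_albert_graph(N, m=3, seed=0),
    "duplication-mutation":    nx.duplication_divergence_graph(N, p=0.4, seed=0),
}

for name, g in models.items():
    print(f"{name:25s} nodes={g.number_of_nodes()} edges={g.number_of_edges()}")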
Classification Results:
● Classified the E. coli transcription, yeast protein-protein, and C. elegans neural networks
● All were classified as one of several duplication-mutation models
● Classification was robust for E. coli and yeast
● Only a few words are needed to make meaningful distinctions
Conclusions:
● What can we say about global properties of networks? small-world, scale-free
● What can we say about local properties? significant subgraphs
● Can we infer the function of subgraphs? modeling and experiments
● How were the graphs made? classify networks using different generative models
Dynamics of Networks?
● Boolean networks
● differential equations