Building Blocks of Biological Networks

Papers:
● Finding Significant Network Motifs I: Milo, Shen-Orr, Itzkovitz, Kashtan, Chklovskii, Alon, Science 298, 824 (2002)
● Finding Significant Network Motifs II: Ziv, Koytcheff, Wiggins, cond-mat/0306610 (2003)
● Classifying Networks using Motifs: Middendorf, Ziv, Adams, Hom, Koytcheff, Levovitz, Woods, Chen, Wiggins, q-bio/0402017 (2004)

What are networks?
● collection of objects = nodes
● connected by interactions = edges

Networks are Everywhere: (figures from M. Newman's website)

Global Properties of Networks:

Small-World Networks:
● Stanley Milgram, sociologist, sent letters to random people in Nebraska to see if they could be forwarded back to a friend in Boston
● Many letters found their way back home: "six degrees of separation"
● Short paths exist between any two nodes
● Path scaling: D ~ log(N)
● Can you make the connection?

Types of Networks I:
● Social networks and collaboration networks are small worlds
● Erdos-Renyi graphs: a random graph with k connections per node (or an average of <k> per node) has mean path length <l> ~ log(N), but is not well clustered
● Watts-Strogatz model: locally clustered with random long-range connections gives small-world behaviour (figure from M. Newman; a short code sketch below compares the two models)

Types of Networks II:
● Scale-free networks have highly heterogeneous connectivities: hubs and outliers
● Probability of a node having connectivity k: P(k) ~ k^(-γ)
● Examples: world-wide web, internet, metabolic networks, protein networks
(figure: hubs and outliers, from Jeong, Tombor, Albert, Oltvai & Barabasi, Nature 407, 651 (2000))

Global Properties of Biological Networks: (from Jeong et al)
● Metabolic networks are scale-free but apparently NOT small-world: the path length is roughly independent of N across archaea, eukaryotes and prokaryotes

Finding the Building Blocks of Networks:
● Are there small subgraphs that get used repeatedly in a larger network?
● Why are they there? What do they do?
(from Milo et al, Science 298, 824 (2002))
● Count all occurrences of all subgraphs up to size n = 4 (e.g. the feed-forward loop)
● Assess significance with respect to a randomized version of the network
● Significance is, in some sense, determined by the random null model!?
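Below is a minimal sketch (not from the slides or the papers) of the small-world comparison described above, assuming Python with the networkx library; the graph size, mean degree and rewiring probability are illustrative choices.

```python
import math
import networkx as nx

N, k, p_rewire = 1000, 10, 0.1

# Erdos-Renyi: edge probability chosen so the mean degree is ~k
er = nx.erdos_renyi_graph(N, k / (N - 1), seed=0)
# Watts-Strogatz: ring lattice with k neighbours, a fraction p_rewire of edges rewired
ws = nx.connected_watts_strogatz_graph(N, k, p_rewire, seed=0)

for name, g in [("Erdos-Renyi", er), ("Watts-Strogatz", ws)]:
    # restrict to the largest connected component so path lengths are well defined
    giant = g.subgraph(max(nx.connected_components(g), key=len))
    print(name,
          "| mean path length:", round(nx.average_shortest_path_length(giant), 2),
          "| clustering:", round(nx.average_clustering(giant), 3),
          "| log(N):", round(math.log(N), 2))
```

The expected pattern is that both graphs have short paths of order log(N), but only the Watts-Strogatz graph keeps the high local clustering of the underlying lattice.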
Scoring Motifs:
● There are 13 directed graphs for n = 3, and 199 graphs for n = 4
● # of occurrences of a subgraph in the 'real' network = Nreal

Randomization:
● The graph is randomized while preserving the # of incoming and outgoing edges for each node
● The randomization also preserves the counts of (n-1)-node subgraphs
● # of occurrences of the subgraph in the randomized networks = Nrand +/- σ

Scoring:
● Score given by the Z-score: Z = (Nreal – Nrand)/σ (a small code sketch at the end of these notes illustrates the procedure)

Problem:
● Counting the # of graphs for n > 4 is REALLY hard

Significant Building Blocks: (table of significant subgraphs, from Milo et al)

Subgraph Function: the Feed-Forward Loop (FFL)
● X regulates Y, and both X and Y regulate Z
● Inputs can be either activators or repressors, giving 8 possibilities for the FFL
● Consider the coherent FFL with AND logic: Z turns ON only when both X and Y are active
● The circuit acts as a persistence detector: X must persist for some time (a time delay) in order to turn ON Z
● The circuit shuts off very rapidly
● A good way to insulate against noise

Going a bit beyond: some linear algebra:
● Adjacency matrix A
● Square: A*A is like taking two steps 'forward' in the graph; if element (i, j) is non-zero then nodes i & j are connected by a two-step path
● Transpose: T(A) reverses all the directions of the edges, taking a step 'back'
● Diagonal: D(A) returns just the diagonal elements of a matrix, identifying closed walks (loops)
● Complement: U(A) = (1 – D)(A) returns all the off-diagonal elements, identifying open walks
● By doing matrix operations we can explore the paths in our network

The Language of Networks: (from Ziv et al, cond-mat/0306610)
● Use A, T, D and U as an alphabet for building up graph "words", e.g. the number of FFLs is given by Σ(D T(A) A A)
● NOTE: a word is an abstraction of a subgraph, with many degeneracies
● Method: compute the significance of each 'word' with respect to a null random network

Word Finds: (figure from Ziv et al: words with a significant positive signal in the E. coli and yeast networks, including the FFL word)

Classifying Networks: (from Middendorf et al., q-bio/0402017)
● The counts for each word in a network give it a fingerprint, e.g.:

Word     Count
AATA     5
DUTA     3
DAAUTA   0
UAAAA    10
DUAA     1

● Each network then resides in a high-dimensional "word-space" (e.g. yeast protein, E. coli transcription, Erdos-Renyi, WWW)

The Classifier:
● Support Vector Machine (SVM) = find the best hyper-plane (dividing plane) that separates two classes of data
● Multi-class SVM: cast votes using all pair-wise classifications (e.g. class votes: cyan 0/2, blue 2/2, red 1/2; a toy sketch at the end of these notes shows the idea)

The Classes:
A) static scale-free
B) small-world (Watts-Strogatz)
C) Erdos-Renyi
D) preferential attachment: a new node is most likely to attach to highly connected nodes
E) duplication and mutation: duplicate a node and its connections, then add/subtract random connections

Classification Results:
● Classified the E. coli transcription, yeast protein-protein & C. elegans neural networks
● All were classified as one of several duplication-mutation models
● The classification was robust for E. coli & yeast
● Only a few words are needed to make meaningful distinctions

Conclusions:
● What can we say about global properties of networks? small-world, scale-free
● What can we say about local properties? significant subgraphs
● Can we infer the function of subgraphs? modeling and experiments
● How were the graphs made? classify networks using different generative models
● Dynamics of networks? Boolean networks, differential equations
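A minimal sketch (not the authors' code) of the motif-scoring idea: count feed-forward loops with the matrix word Σ(D T(A) A A), i.e. the trace of T(A)·A·A, and compare against degree-preserving randomizations to get a Z-score. The edge-swap randomizer below is a simplified stand-in for the procedure in Milo et al: it preserves each node's in- and out-degree but not the (n-1)-subgraph counts. The toy network is a random graph, used only to exercise the code.

```python
import numpy as np

rng = np.random.default_rng(0)

def count_ffl(A):
    """Number of feed-forward loops x->y, y->z, x->z: the word sum D(T(A) A A)."""
    return int(np.trace(A.T @ A @ A))

def degree_preserving_randomization(A, n_swaps=10000):
    """Randomize a directed graph by swapping the targets of random edge pairs,
    which preserves every node's in-degree and out-degree."""
    A = A.copy()
    edges = list(zip(*np.nonzero(A)))
    for _ in range(n_swaps):
        i, j = rng.choice(len(edges), size=2, replace=False)
        (a, b), (c, d) = edges[i], edges[j]
        # propose a->d and c->b; reject swaps that create self-loops or duplicate edges
        if a == d or c == b or A[a, d] or A[c, b]:
            continue
        A[a, b] = A[c, d] = 0
        A[a, d] = A[c, b] = 1
        edges[i], edges[j] = (a, d), (c, b)
    return A

# Toy directed network (illustrative only, not a real biological network)
N = 50
A_real = (rng.random((N, N)) < 0.06).astype(int)
np.fill_diagonal(A_real, 0)

n_real = count_ffl(A_real)
n_rand = np.array([count_ffl(degree_preserving_randomization(A_real)) for _ in range(100)])
z = (n_real - n_rand.mean()) / n_rand.std()
print(f"Nreal = {n_real}, Nrand = {n_rand.mean():.1f} +/- {n_rand.std():.1f}, Z = {z:.2f}")
```

For a real transcription network the FFL count would be expected to sit well above the randomized ensemble; for this toy random graph Z should come out close to zero.

And a toy illustration of classifying networks by their word-count fingerprints with a multi-class SVM, assuming scikit-learn; the fingerprints here are synthetic (not the data of Middendorf et al.), and the class names and fingerprint dimension are made up for the example. scikit-learn's SVC handles multiple classes with one-vs-one voting, matching the pairwise-vote scheme described above.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
n_words = 20                                   # dimension of the word-count fingerprint
centres = rng.poisson(5, size=(3, n_words))    # one "typical" fingerprint per model class

# synthetic fingerprints for three generative-model classes
X = np.vstack([rng.poisson(c, size=(30, n_words)) for c in centres])
y = np.repeat(["pref-attach", "dup-mutation", "erdos-renyi"], 30)

clf = SVC(kernel="linear").fit(X, y)

# fingerprint of an "unknown" network drawn from the duplication-mutation class
query = rng.poisson(centres[1], size=(1, n_words))
print("predicted generative model:", clf.predict(query)[0])
```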