Networks in Biology
7.6., 14.6., and 21.6.2013
Dr. Katja Nowick
[email protected]
www.nowick-lab.info

Introduction

Networks in Biology

Networks in cells (molecular networks):
• Metabolic networks
• Gene regulatory networks
• Protein-protein interaction networks

Networks between cells:
• Neural networks
• Immune system

Networks in ecosystems:
• Food networks
• Cooperation/Symbiosis

Social networks:
• Friendships
• Epidemiology

The identity of the nodes (vertices) and the meaning of the links (edges) depend on the network being studied.

Networks in cells: Metabolic networks
A metabolic network is the complete set of metabolic and physical processes that determine the physiological and biochemical properties of a cell. As such, it comprises the chemical reactions of metabolism, the metabolic pathways, and the regulatory interactions that guide these reactions. It breaks down metabolic pathways (such as glycolysis and the citric acid cycle) into their respective reactions and enzymes.

Networks in cells: Gene regulatory networks
A gene regulatory network (GRN) is a collection of DNA segments in a cell that interact with each other indirectly (through their RNA and protein expression products) and with other substances in the cell, thereby governing the expression levels of mRNA and proteins.

Networks in cells: Protein-protein interaction networks
Protein–protein interactions occur when two or more proteins bind together, often to carry out their biological function. Many of the most important molecular processes in the cell, such as DNA replication, are carried out by large molecular machines that are built from a large number of protein components organized by their protein–protein interactions.
Picture: Overview of known and predicted protein–protein interactions in pre-40S complexes. The interaction map depicts interactions between the various assembly factors and ribosomal proteins in pre-40S complexes.

Networks between cells: Neural networks
Neurons in the brain are densely connected with one another, which results in complex networks controlling structural and functional aspects of the brain (e.g. behavior).

Networks between cells: Immune system
The immune system is a system of biological structures and processes within an organism that protects against disease.
Picture: Antigen-presenting cells (APCs) present antigen on their class II MHC molecules (MHC2). Helper T cells recognize these with the help of their CD4 coreceptor (CD4+). The activation of a resting helper T cell causes it to release cytokines and other stimulatory signals (green arrows) that stimulate the activity of macrophages, killer T cells and B cells, the latter producing antibodies. The stimulation of B cells and macrophages follows a proliferation of T helper cells.

Networks in ecosystems: Food networks
All organisms are connected to each other through feeding interactions. That is, if a species eats or is eaten by another species, the two are connected in an intricate food web of predator and prey interactions.

Networks in ecosystems: Cooperation/Symbiosis
Symbiosis describes a close and often long-term interaction between two or more different biological species.
Picture: In a symbiotic mutualistic relationship, the clownfish feeds on small invertebrates that could otherwise harm the sea anemone, and the fecal matter from the clownfish provides nutrients to the sea anemone. The clownfish is additionally protected from predators by the anemone's stinging cells, to which the clownfish is immune.

Social networks: Friendships
Friendship networks are used to analyze group dynamics, the formation of subgroups, decision making, etc.
Picture: Zachary network of a university karate club: a division into two subgroups over a political issue led to the formation of two separate clubs (Zachary 1977).

Social networks: Epidemiology
Epidemiology literally means “the study of what is upon the people”. It is used to study the spread of diseases, e.g. virus infections or sexually transmitted diseases.
Picture: Worldwide distribution of infectious diseases

Directed and undirected networks
Picture: a directed network and an undirected network, each drawn on nodes 1 to 5
In-degree of a node = its number of incoming links
Out-degree of a node = its number of outgoing links
Degree of a node = its number of links
PS: node = vertex; link = edge

Node degree distribution (= connectivity distribution)
In undirected networks, the node degree of a node n is its number of links. A self-loop counts as two edges toward the node degree. The node degree distribution gives the number of nodes with degree k for k = 0,1,….
In directed networks, the in-degree of a node n is the number of incoming links and the out-degree is the number of outgoing links. Analogous to undirected networks, there is an in-degree distribution and an out-degree distribution.

Characteristics of biological networks
1. Power law behavior
The node degree distribution (= connectivity distribution) follows a power law: a few nodes with many links (hubs) and many nodes with only a few links.
Picture: plot of the number of nodes against the degree, with the mean and the long tail marked
Because of the long tail, an “average node degree” is not a meaningful summary of such a network; these networks are often called “scale free”.

Characteristics of biological networks
2. Small world characteristics
Every node is connected to every other node by only a small number of links.
This property is commonly achieved by a small number of central nodes (hubs) with many connections.
The distance between two nodes is the smallest number of links that have to be traversed to connect them (= length of the shortest path*).
If the average distance between two nodes is $l \sim \log N$ or smaller, where N is the number of nodes, then the network has small world characteristics.
This is true even for very big networks, e.g.
• Acquaintance network of the entire world: you are connected to everybody through just 6 links
• Facebook: only 4 intermediate people on average between you and anybody else
*The length of the shortest path between two nodes n and m is L(n,m). The shortest path length distribution gives the number of node pairs (n,m) with L(n,m) = k for k = 1,2,… and can be used to analyze small-world properties of the network.

Characteristics of biological networks
3. Hierarchical and modular organization
Modules = natural divisions of a network into groups of nodes such that there are many links within the groups and few links between groups.
This organization allows for
a) Network robustness: some redundancy preserves the function even if components fail
b) Network evolution: mutations can be tested without being fatal for the individual

Characteristics of biological networks
4. Certain motifs are common
Network motifs = the simplest building blocks of networks: recurrent and statistically significant sub-graphs or patterns
Examples:
• Positive/negative auto-regulation: a transcription factor (TF) enhances/represses its own transcription
• Feed-forward loop: consists of two TFs, one regulating the other and both regulating the same target gene; it can accelerate or delay the regulation of the target gene

Characteristics of biological networks
5. Preferential attachment
Networks typically grow over time by adding nodes.
New nodes seem to prefer to connect to nodes that already have many connections: “rich get richer”
This ultimately creates the power law shape of the connectivity distribution
e.g. Price’s model: publications that are already famous tend to be cited more frequently

Characteristics of biological networks
6. Are dynamic
• Often need to be activated by external or internal signals
  e.g. neuronal networks: a change in the environment activates a neuronal network, leading to a change in behavior
• Links can be modified to achieve different outputs
  e.g. gene regulatory network: cells activate different TFs at different times or locations during development, which affects cell fate
• Links can be added or removed
  e.g. social networks: new friendships form or friends break up
• Change over evolutionary time scales
  e.g. all molecular networks: duplication of genes, mutations of genes, etc.

Preferential attachment
Why do (some) networks follow a power law distribution?
Originally called “cumulative advantage” (Price 1976); the term “preferential attachment” was coined by Barabási & Albert (1999)
Price’s model considers directed networks
Barabási and Albert’s model is for undirected networks

Preferential attachment
Price’s model of a citation network of scientific papers:
• Directed network
• Nodes = papers
• Links = citations
• c = average number of papers cited by a paper, i.e. the average out-degree of the network
• A new paper cites existing papers at random, with probability proportional to the number of citations a paper already has plus a constant a (a > 0, so that the probability is not zero for papers without citations)

Preferential attachment
Computational implementation: simulation of the network growth
• Start by giving each node a fixed out-degree of c
• Add new links in proportion to the in-degree a node has
• More precisely: with probability c/(c+a), attach a new link in proportion to the in-degree of a node; otherwise choose a node uniformly at random from the set of all nodes
• To choose between these two options, create a random number $0 \le r < 1$; then, if r < c/(c+a), choose in proportion to the in-degree
• Picking a node uniformly is easy
• To pick a node in proportion to its in-degree, select a link uniformly at random and then pick the node it points to (a node with in-degree q is q times as likely to be picked as a node with in-degree 1)

Preferential attachment
How to do this in practice:
Make an array (list) that contains, for every link, the node that the link points to (order is not important):
1 4 2 1 3 1 3 4 1 2 2 … 5
Then simply choose an element from this list uniformly at random
This gives you a node with probability proportional to its in-degree

Preferential attachment
Computational implementation: simulation of the network growth, summary
1. Generate a random number r in the range $0 \le r < 1$
2. If r < c/(c+a), choose an element uniformly at random from the array
3. Otherwise choose a node uniformly at random from the set of all nodes
4. Create a new link to the chosen node
5. Update the array by adding the node that got the new link

Preferential attachment
In other networks:
• World Wide Web: Wikipedia vs. a personal homepage
• Circle of friends: “everybody's” favorite vs. the “strange” person
• Protein-protein interaction networks: RNA polymerase, KAP1 vs. a specialized enzyme
• Gene regulatory network: TF with a short vs. one with a long binding motif
• Immune system: B-cells vs. macrophages
• Food network: generalists vs. specialists

Preferential attachment
Extensions to the model: Time
Older nodes have had more time to acquire links, so they are expected to have more links on average
Experiment by Salganik et al. 2006: song download preference
• Website with songs of little-known artists
• People could download songs for free
• People were told beforehand how often a song had already been downloaded by others
• Observation: songs that had been favored at the beginning of the experiment had a strong advantage and many more downloads in the end
• Control: the experiment was repeated with the download numbers shuffled among the songs; still, the songs that were (wrongly) reported to have the highest number of previous downloads won in the end
Conclusion: the outcome reflects preferential attachment and not real song quality

Preferential attachment
Extensions to the model: Removing links
In Price’s model, links were always added but never removed (you cannot change a published paper by removing citations)
But in biological networks it is also typical that links get removed
Yet their degree distribution can follow a power law
Simple case: a node loses a link with a probability proportional to the number of links it has (“preferential attachment in reverse”)
Note: removing a link affects both nodes that it connects
We want the network to grow, so links must be added more often than they are removed
It can be shown mathematically that the resulting network still has power law behavior (for details, see chapter 14 in Networks: An Introduction by M.E.J. Newman)

Preferential attachment
Extensions to the model: Attractiveness
So far, all nodes with the same number of links had the same chance of gaining links
But this is not very realistic:
• some websites are more likely to receive new links (e.g. something like Wikipedia vs. a personal homepage)
• some papers are more likely to get cited (e.g. a paper in Science)
Some nodes are more “attractive”
If the “attractiveness” factor is very strong, we lose the power law behavior

Typical parameters analyzed in a network
http://med.bioinf.mpi-inf.mpg.de/netanalyzer/help/2.7/index.html#simple
Degree of a node = its number of links
Nodes with exceptionally many links are “hubs”
Parameters related to the neighborhood of a node:
• The neighborhood of a given node n is the set of its neighbors. The connectivity of n, denoted by $k_n$, is the size of its neighborhood.
• The average number of neighbors indicates the average connectivity of a node in the network.
• Network centralization: networks whose topologies resemble a star have a centralization close to 1, whereas decentralized networks have a centralization close to 0.
• The network heterogeneity reflects the tendency of a network to contain hub nodes.

Typical parameters analyzed in a network
The clustering coefficient of a node is the number of triangles (3-loops) that pass through this node, relative to the maximum number of 3-loops that could pass through the node.
Picture: There is one triangle that passes through node b (the triangle bcd). The maximum number of triangles that could pass through b is three (in this case, the pairs (a, c) and (a, d) would be connected additionally). This yields a clustering coefficient of $C_b = 1/3$.
Ravasz et al. 2002 used the average clustering coefficient distribution to identify a modular organization of metabolic networks.

Which nodes are the most important nodes?

Typical parameters analyzed in a network
The betweenness centrality of a node reflects the amount of control that this node exerts over the interactions of other nodes in the network. This measure favors nodes that join communities (dense sub-networks), rather than nodes that lie inside a community.
$C_b(n) = \sum_{s \neq n \neq t} \frac{\sigma_{st}(n)}{\sigma_{st}}$, where s and t are nodes in the network different from n, $\sigma_{st}$ denotes the number of shortest paths from s to t, and $\sigma_{st}(n)$ is the number of shortest paths from s to t that n lies on.
The betweenness value for each node n is normalized by dividing by the number of node pairs excluding n: (N-1)(N-2)/2, where N is the total number of nodes in the connected component that n belongs to.
Picture: The betweenness centrality of node b is computed as follows:
$C_b(b) = \left( \frac{\sigma_{ac}(b)}{\sigma_{ac}} + \frac{\sigma_{ad}(b)}{\sigma_{ad}} + \frac{\sigma_{ae}(b)}{\sigma_{ae}} + \frac{\sigma_{cd}(b)}{\sigma_{cd}} + \frac{\sigma_{ce}(b)}{\sigma_{ce}} + \frac{\sigma_{de}(b)}{\sigma_{de}} \right) / 6 = \left( \frac{1}{1} + \frac{1}{1} + \frac{2}{2} + \frac{1}{2} + 0 + 0 \right) / 6 = 3.5 / 6 \approx 0.583$

Example: Betweenness centrality in a TF network (Nowick et al. 2009)
Picture: The four nodes with the highest betweenness centrality (BC) are labeled. They tend to connect the two modules/communities.

Typical parameters analyzed in a network
Closeness centrality is a measure of how fast information spreads from a given node to other reachable nodes in the network.
The closeness centrality $C_c(n)$ of a node n is defined as the reciprocal of the average shortest path length and is computed as follows:
$C_c(n) = 1 / \mathrm{avg}(L(n,m))$, where L(n,m) is the length of the shortest path between two nodes n and m.
Picture: For example, the closeness centrality of node b is computed as follows:
$C_c(b) = 1 / \left( (L(b,a) + L(b,c) + L(b,d) + L(b,e)) / 4 \right) = 4 / (1 + 1 + 1 + 2) = 4/5 = 0.8$

Which nodes are the most important nodes?
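The two worked examples for node b can be checked with a short script. This is a minimal sketch, not the NetworkAnalyzer implementation. The picture is not available here, so the edge list (a-b, b-c, b-d, c-e, d-e) is an assumption inferred from the path counts and path lengths in the examples. The function names are hypothetical, and the graph is assumed to be undirected and connected.

```python
from collections import deque
from itertools import combinations

# Assumed edge list, inferred from the worked numbers for node b
# (the original picture is not available).
EDGES = [("a", "b"), ("b", "c"), ("b", "d"), ("c", "e"), ("d", "e")]


def build_adjacency(edges):
    """Return an undirected adjacency map {node: set of neighbors}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs(adj, source):
    """Breadth-first search from source.

    Returns (dist, sigma): dist[v] is the shortest path length L(source, v),
    sigma[v] is the number of distinct shortest paths from source to v.
    """
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:  # first visit fixes the distance
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:  # u precedes v on a shortest path
                sigma[v] += sigma[u]
    return dist, sigma


def betweenness(adj, n):
    """Normalized betweenness centrality of n; assumes a connected graph."""
    info = {v: bfs(adj, v) for v in adj}
    others = [v for v in adj if v != n]
    total = 0.0
    for s, t in combinations(others, 2):
        dist_s, sigma_s = info[s]
        dist_t, sigma_t = info[t]
        # n lies on a shortest s-t path iff L(s,n) + L(n,t) = L(s,t)
        if dist_s[n] + dist_t[n] == dist_s[t]:
            total += sigma_s[n] * sigma_t[n] / sigma_s[t]
    pairs = (len(adj) - 1) * (len(adj) - 2) / 2  # node pairs excluding n
    return total / pairs if pairs else 0.0


def closeness(adj, n):
    """Reciprocal of the average shortest path length from n."""
    dist, _ = bfs(adj, n)
    lengths = [d for v, d in dist.items() if v != n]
    return len(lengths) / sum(lengths) if lengths else 0.0


if __name__ == "__main__":
    adj = build_adjacency(EDGES)
    print(round(betweenness(adj, "b"), 3))  # 0.583
    print(closeness(adj, "b"))  # 0.8
```

Counting shortest paths with one breadth-first search per node keeps the sketch close to the definitions of σst and L(n,m). For large networks a dedicated algorithm such as Brandes' method would usually be preferred.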