BCAM, January 2011

Adventures in random graphs: Models, structures and algorithms

Armand M. Makowski
ECE & ISR/HyNet
University of Maryland at College Park
[email protected]

Complex networks

• Many examples
  – Biology (genomics, proteomics)
  – Transportation (communication networks, Internet, roads and railroads)
  – Information systems (World Wide Web)
  – Social networks (Facebook, LinkedIn, etc.)
  – Sociology (friendship networks, sexual contacts)
  – Bibliometrics (co-authorship networks, references)
  – Ecology (food webs)
  – Energy (electricity distribution, smart grids)
• Larger context of “Network Science”

Objectives

• Identify generic structures and properties of “networks”
• Mathematical models and their analysis
• Understand how network structure and processes on networks interact
• Dynamics of networks vs. dynamics on networks

What is new? Very large data sets are now easily available!

Bibliography (I): Random graphs

• N. Alon and J.H. Spencer, The Probabilistic Method, Second Edition, Wiley-Interscience Series in Discrete Mathematics and Optimization, John Wiley & Sons, New York (NY), 2000.
• A.D. Barbour, L. Holst and S. Janson, Poisson Approximation, Oxford Studies in Probability 2, Oxford University Press, Oxford (UK), 1992.
• B. Bollobás, Random Graphs, Second Edition, Cambridge Studies in Advanced Mathematics, Cambridge University Press, Cambridge (UK), 2001.
• R. Durrett, Random Graph Dynamics, Cambridge Series in Statistical and Probabilistic Mathematics, Cambridge University Press, Cambridge (UK), 2007.
• M. Draief and L. Massoulié, Epidemics and Rumours in Complex Networks, Cambridge University Press, Cambridge (UK), 2009.
• D. Dubhashi and A. Panconesi, Concentration of Measure for the Analysis of Randomized Algorithms, Cambridge University Press, New York (NY), 2009.
• M. Franceschetti and R. Meester, Random Networks for Communication: From Statistical Physics to Information Systems, Cambridge Series in Statistical and Probabilistic Mathematics, Cambridge University Press, Cambridge (UK), 2007.
• S. Janson, T. Łuczak and A. Ruciński, Random Graphs, Wiley-Interscience Series in Discrete Mathematics and Optimization, John Wiley & Sons, New York (NY), 2000.
• R. Meester and R. Roy, Continuum Percolation, Cambridge University Press, Cambridge (UK), 1996.
• M.D. Penrose, Random Geometric Graphs, Oxford Studies in Probability 5, Oxford University Press, New York (NY), 2003.

Bibliography (II): Survey papers

• R. Albert and A.-L. Barabási, “Statistical mechanics of complex networks,” Reviews of Modern Physics 74 (2002), pp. 47–97.
• M.E.J. Newman, “The structure and function of complex networks,” SIAM Review 45 (2003), pp. 167–256.

Bibliography (III): Complex networks

• A. Barrat, M. Barthélemy and A. Vespignani, Dynamical Processes on Complex Networks, Cambridge University Press, Cambridge (UK), 2008.
• R. Cohen and S. Havlin, Complex Networks: Structure, Robustness and Function, Cambridge University Press, Cambridge (UK), 2010.
• D. Easley and J. Kleinberg, Networks, Crowds, and Markets: Reasoning About a Highly Connected World, Cambridge University Press, Cambridge (UK), 2010.
• M.O. Jackson, Social and Economic Networks, Princeton University Press, Princeton (NJ), 2008.
• M.E.J. Newman, A.-L. Barabási and D.J. Watts (Editors), The Structure and Dynamics of Networks, Princeton University Press, Princeton (NJ), 2006.

LECTURE 1

Basics of graph theory

What are graphs?
With V a finite set, a graph G is an ordered pair (V, E) where the elements of V are called vertices (or nodes) and E ⊆ V × V is the set of edges (or links); we write E(G) = E. Nodes i and j are said to be adjacent, written i ∼ j, if e = (i, j) ∈ E, i, j ∈ V.

Multiple representations for G = (V, E)

Set-theoretic – Edge variables {ξij, i, j ∈ V} with ξij = 1 if (i, j) ∈ E, and ξij = 0 if (i, j) ∉ E.

Algebraic – Adjacency matrix A = (Aij) with Aij = 1 if (i, j) ∈ E, and Aij = 0 if (i, j) ∉ E.

Some terminology

• Simple graphs vs. multigraphs
• Directed vs. undirected: (i, j) ∈ E if and only if (j, i) ∈ E
• No self-loops: (i, i) ∉ E, i ∈ V

Here: Simple, undirected graphs with no self-loops!

Types of graphs

• The empty graph
• Complete graphs
• Trees/forests
• A subgraph H = (W, F) of G = (V, E) is a graph with vertex set W ⊆ V and edge set F = E ∩ (W × W) (the subgraph induced by W)
• Cliques (complete subgraphs)

Labeled vs. unlabeled graphs

A graph automorphism of G = (V, E) is any one-to-one mapping σ : V → V that preserves the graph structure, namely (σ(i), σ(j)) ∈ E if and only if (i, j) ∈ E. The graph automorphisms of G form a group, denoted Aut(G).

Of interest

• Connectivity and k-connectivity (with k ≥ 1)
• Number and size of components
• Isolated nodes
• Degree of a node: degree distribution/average degree, maximal/minimal degree
• Distance between nodes (in terms of number of hops): shortest path, diameter, eccentricity, radius
• Small graph containment (e.g., triangles, trees, cliques, etc.)
• Clustering
• Centrality: degree, closeness, betweenness

For i, j in V,

    ℓij = shortest path length between nodes i and j in the graph G = (V, E)

Convention: ℓij = ∞ if nodes i and j belong to different components, and ℓii = 0.
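The shortest-path conventions above are easy to make concrete. A minimal sketch in Python (assuming the graph is stored as a dict mapping each node to its set of neighbors; the function name is illustrative, not from the lecture):

```python
from collections import deque
import math

def shortest_path_lengths(adj, i):
    """Breadth-first search from node i: returns {j: l_ij}, with
    l_ii = 0 and l_ij = math.inf when j lies in a different
    component, matching the conventions above."""
    dist = {v: math.inf for v in adj}
    dist[i] = 0
    queue = deque([i])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] == math.inf:  # not yet reached
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Path 1 - 2 - 3 plus the isolated node 4:
adj = {1: {2}, 2: {1, 3}, 3: {2}, 4: set()}
dist = shortest_path_lengths(adj, 1)  # {1: 0, 2: 1, 3: 2, 4: inf}
```

From these distances, quantities such as the diameter, eccentricities and radius follow by taking maxima and minima over nodes.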
Average distance:

    \ell_{\mathrm{Avg}} = \frac{1}{|V|(|V|-1)} \sum_{i \in V} \sum_{j \in V} \ell_{ij}

Diameter:

    d(G) = \max ( \ell_{ij}, \ i, j \in V )

Eccentricity:

    \mathrm{Ec}(i) = \max ( \ell_{ij}, \ j \in V ), \quad i \in V

Radius:

    \mathrm{rad}(G) = \min ( \mathrm{Ec}(i), \ i \in V )

Note that ℓAvg ≤ d(G) and rad(G) ≤ d(G) ≤ 2 rad(G).

Centrality

Q: How central is a node?

Closeness centrality:

    g(i) = \frac{1}{\sum_{j \in V} \ell_{ij}}, \quad i \in V

Betweenness centrality:

    b(i) = \sum_{k \neq i, \ j \neq i, \ k \neq j} \frac{\sigma_{kj}(i)}{\sigma_{kj}}, \quad i \in V

where σkj is the number of shortest paths between nodes k and j, and σkj(i) is the number of those paths that pass through i.

Clustering

Clustering coefficient of node i:

    C(i) = \frac{\sum_{j \neq i, \ k \neq i, \ j \neq k} \xi_{ij} \xi_{ik} \xi_{kj}}{\sum_{j \neq i, \ k \neq i, \ j \neq k} \xi_{ij} \xi_{ik}}

Average clustering coefficient:

    C_{\mathrm{Avg}} = \frac{1}{n} \sum_{i \in V} C(i)

Global clustering coefficient:

    C = 3 \cdot \frac{\text{number of fully connected triples (triangles)}}{\text{number of triples}}

Random graphs

Random graphs?

G(V) ≡ Collection of all (simple, undirected, self-loop-free) graphs with vertex set V.

Definition – Given a probability triple (Ω, F, P), a random graph is simply a graph-valued rv G : Ω → G(V).

Modeling – We need only specify the pmf {P[G = G], G ∈ G(V)}. Many, many ways to do that!

Equivalent representations for G

Set-theoretic – Link assignment rvs {ξij, i, j ∈ V} with ξij = 1 if (i, j) ∈ E(G), and ξij = 0 if (i, j) ∉ E(G).

Algebraic – Random adjacency matrix A = (Aij) with Aij = 1 if (i, j) ∈ E(G), and Aij = 0 if (i, j) ∉ E(G).

Why random graphs?

Useful models in many applications to capture binary relationships between participating entities. Because

    |G(V)| = 2^{\frac{|V|(|V|-1)}{2}} \simeq 2^{\frac{|V|^2}{2}},

a very large number, there is a need to identify/discover typicality!

Scaling laws – Zero-one laws as |V| becomes large, e.g., V ≡ Vn = {1, . . . , n} (n → ∞).

Menagerie of random graphs

• Erdős–Rényi graphs G(n; m)
• Erdős–Rényi graphs G(n; p)
• Generalized Erdős–Rényi graphs
• Geometric random models/disk models
• Intrinsic fitness and threshold random models
• Random intersection graphs
• Growth models: preferential attachment, copying
• Small worlds
• Exponential random graphs
• Etc.

Erdős–Rényi graphs G(n; m)

With

    1 \leq m \leq \binom{n}{2} = \frac{n(n-1)}{2},

the pmf on G(Vn) is specified by

    P[G(n; m) = G] = u(n; m)^{-1} if |E(G)| = m, and 0 if |E(G)| ≠ m,

where

    u(n; m) = \binom{n(n-1)/2}{m}

This is uniform selection over the collection of all graphs on the vertex set {1, . . . , n} with exactly m edges.

Erdős–Rényi graphs G(n; p)

With 0 ≤ p ≤ 1, the link assignment rvs {χij(p), 1 ≤ i < j ≤ n} are i.i.d. {0, 1}-valued rvs with

    P[χij(p) = 1] = 1 − P[χij(p) = 0] = p, 1 ≤ i < j ≤ n.

For every G in G(V),

    P[G(n; p) = G] = p^{|E(G)|} (1-p)^{\frac{n(n-1)}{2} - |E(G)|}

G(n; p) is related to, but easier to implement than, G(n; m). The two models exhibit similar behavior/results under the matching condition

    E[|E(G(n; p))|] = |E(G(n; m))|, namely m = \frac{n(n-1)}{2} \, p

Generalized Erdős–Rényi graphs

With 0 ≤ pij ≤ 1, 1 ≤ i < j ≤ n, the link assignment rvs {χij(pij), 1 ≤ i < j ≤ n} are mutually independent {0, 1}-valued rvs with

    P[χij(pij) = 1] = 1 − P[χij(pij) = 0] = pij, 1 ≤ i < j ≤ n.

An important case: with positive weights w1, . . . , wn,

    p_{ij} = \frac{w_i w_j}{W} with W = w_1 + \ldots + w_n

Geometric random graphs (d ≥ 1)

With random locations X1, . . . , Xn in R^d, the link assignment rvs {χij(ρ), 1 ≤ i < j ≤ n} are given by

    χij(ρ) = 1[ ‖Xi − Xj‖ ≤ ρ ], 1 ≤ i < j ≤ n,

where ρ > 0.

Usually, the rvs X1, . . . , Xn are taken to be i.i.d. rvs uniformly distributed over some compact subset Γ ⊆ R^d. Even then, it is not so obvious to write P[G(n; ρ) = G], G ∈ G(V), since the rvs {χij(ρ), 1 ≤ i < j ≤ n} are no longer i.i.d. For d = 2 there is a long history of use in modeling wireless networks (known as the disk model), where ρ is interpreted as the transmission range.

Threshold random graphs

Given R+-valued i.i.d. rvs W1, . . . , Wn with absolutely continuous probability distribution function F,

    i ∼ j if and only if Wi + Wj > θ

for some θ > 0.

Generalizations: i ∼ j if and only if R(Wi, Wj) > θ for some symmetric mapping R : R+² → R+.

Random intersection graphs

Given a finite set W ≡ {1, . . . , W} of features, with random subsets K1, . . . , Kn of W,

    i ∼ j if and only if Ki ∩ Kj ≠ ∅

Applications: co-authorship networks, random key distribution schemes, classification/clustering.

Growth models

{Gt, t = 0, 1, . . .} with rules Vt → Vt+1 and Gt+1 ← (Gt, Vt+1)

• Preferential attachment
• Copying

These mechanisms give rise to scale-free networks.

Small worlds

• Between randomness and order
• Shortcuts
• Short paths but high clustering

Milgram’s experiment and six degrees of separation.

Exponential random graphs

• Models favored by sociologists and statisticians
• Graph analog of the exponential families often used in statistical modeling
• Related to Markov random fields

With I parameters θ = (θ1, . . . , θI) and a set of I observables (statistics) ui : G(V) → R+, we postulate

    P[G = G] = \frac{e^{\sum_{i=1}^{I} \theta_i u_i(G)}}{Z(\theta)}, \quad G \in G(V),

with normalization constant

    Z(\theta) = \sum_{G \in G(V)} e^{\sum_{i=1}^{I} \theta_i u_i(G)}

In sum

• Many different ways to specify the pmf on G(V)
  – Local description vs. global representation
  – Static vs. dynamic
  – Application-dependent mechanisms
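To make the menagerie concrete, the two most basic models above can be simulated in a few lines. A sketch in Python (function and variable names are illustrative, not from the lecture; for the geometric model the points are taken i.i.d. uniform on the unit square, so d = 2 and Γ = [0, 1]²):

```python
import itertools
import math
import random

def erdos_renyi(n, p, seed=None):
    """G(n; p): each of the n(n-1)/2 unordered pairs {i, j} is an
    edge independently with probability p."""
    rng = random.Random(seed)
    return {(i, j) for i, j in itertools.combinations(range(n), 2)
            if rng.random() < p}

def geometric_graph(n, rho, seed=None):
    """G(n; rho) for d = 2: i ~ j iff ||X_i - X_j|| <= rho, with
    X_1, ..., X_n i.i.d. uniform on the unit square."""
    rng = random.Random(seed)
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    return {(i, j) for i, j in itertools.combinations(range(n), 2)
            if math.dist(pts[i], pts[j]) <= rho}

# Matching condition with G(n; m): the expected number of edges of
# G(n; p) is p * n(n-1)/2 (here 0.1 * 200*199/2 = 1990).
n, p = 200, 0.1
avg_edges = sum(len(erdos_renyi(n, p, seed=s)) for s in range(50)) / 50
```

Membership of the pair (i, j) in the returned edge set plays the role of the link assignment rv χij taking the value 1.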