Traffic-driven model of the World-Wide-Web Graph

A. Barrat, LPT, Orsay, France
M. Barthélemy, CEA, France
A. Vespignani, LPT, Orsay, France

Outline
• The WebGraph
• Some empirical characteristics
• Various models
• Weights and strengths
• Our model: definition; analysis (analytics + numerics)
• Conclusions

The Web as a directed graph
• nodes i: web pages
• directed links: hyperlinks
• in- and out-degrees

Empirical facts
• Small world: captured by Erdős-Rényi graphs
  With probability p an edge is established between each pair of vertices
  <k> = p N; Poisson degree distribution

Empirical facts
• Small world
• Large clustering: different neighbours of a node will likely know each other
  (Figure: node n and its neighbours 1, 2, 3; higher probability to be connected)
  => graph models with large clustering, e.g. Watts-Strogatz 1998

Empirical facts
• Small world
• Large clustering
• Dynamical network
• Broad connectivity distributions
  • also observed in many other contexts (from biological to social networks)
  • huge activity of modeling
(Barabási-Albert 1999; Broder et al. 2000; Kumar et al. 2000; Adamic-Huberman 2001; Laura et al. 2003)

Various growing network models
• Barabási-Albert (1999): preferential attachment
• Many variations on the BA model: rewiring (Tadic 2001, Krapivsky et al. 2001), addition of edges, directed model (Dorogovtsev-Mendes 2000, Cooper-Frieze 2001), fitness (Bianconi-Barabási 2001), ...
• Kumar et al. (2000): copying mechanism
• Pandurangan et al. (2002): PageRank + preferential attachment
• Laura et al. (2002): multi-layer model
• Menczer (2002): textual content of web pages

The Web as a directed graph
• nodes i: web pages
• directed links: hyperlinks
• Broad P(k_in); cut-off for P(k_out)
(Broder et al. 2000; Kumar et al. 2000; Adamic-Huberman 2001; Laura et al. 2003)

Additional level of complexity: weights and strengths
• Links carry weights/traffic: w_ij
• In- and out-strengths: s_in(i) = sum_j w_ji, s_out(i) = sum_j w_ij
• Adamic-Huberman 2001: broad distribution of s_in

Model: directed network
(i) Growth
(ii) Strength-driven preferential attachment (new node n: k_out = m outlinks)
"Busy gets busier"

AND... Weights reinforcement mechanism
The new traffic n -> i increases the traffic i -> j
"Busy gets busier"

Evolution equations (continuous approximation)
Coupling term

Resolution
Ansatz supported by numerics:

Results

Approximation
Total in-weight sum_i s_in(i): approximately proportional to the total number of in-links sum_i k_in(i), times the average weight <w> = 1 + δ
Then: A = 1 + δ, γ_sin ∈ [2; 2 + 1/m]

Numerical simulations
• Measure of A; prediction of γ
• Approximation of γ

Numerical simulations
NB: broad P(s_out) even if k_out = m

Clustering spectrum
i.e. the fraction of connected pairs of neighbours of node i

Clustering spectrum
• δ increases => clustering increases
• New pages: point to various well-known pages, often connected together => large clustering for small nodes
• Old, popular pages with large k: many in-links from many less popular pages which are not connected together => smaller clustering for large nodes

Clustering and weighted clustering
Weighted clustering takes into account the relevance of triangles in the global traffic

Clustering and weighted clustering
Weighted clustering larger than topological clustering: triangles carry a large part of the traffic

Assortativity
k_nn: average connectivity of the nearest neighbours of i

Assortativity
• k_nn: disassortative behaviour, as usual in growing network models, and typical in technological networks
• lack of correlations in popularity as measured by the in-degree

Summary
• Web: heterogeneous topology and traffic
• Mechanism taking into account the interplay between topology and traffic
• Simple mechanism => complex behaviour, scale-free distributions for connectivity and traffic
• Analytical study possible
• Study of correlations: non-trivial hierarchical behaviour
• Possibility to add features (fitnesses, rewiring, addition of edges, etc.), to modify the redistribution rule...
• Empirical studies of traffic and correlations?
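
Illustration (not part of the original slides): a minimal Python sketch of the growth rule summarised above, i.e. growth, strength-driven preferential attachment with k_out = m, and weight reinforcement. The slides do not spell out the exact rules, so the following are assumptions of this sketch: the function name grow_web_graph, the seed graph (a directed ring of m + 1 pages), an attachment probability proportional to s_in + 1 (the offset lets pages without in-links be chosen), unit weight for every new link, and a reinforcement that spreads an extra traffic delta over the out-links of the chosen page in proportion to their current weights.

```python
import random


def grow_web_graph(n_nodes, m=2, delta=0.5, seed=0):
    """Grow a weighted directed graph; weights[i][j] is the traffic on i -> j.

    All rules below are assumptions of this sketch, not a transcription
    of the authors' equations.
    """
    rng = random.Random(seed)
    n0 = m + 1  # assumed seed size
    weights = [dict() for _ in range(n_nodes)]
    s_in = [0.0] * n_nodes
    s_out = [0.0] * n_nodes

    # Assumed seed: directed ring of n0 pages, unit weight on each link.
    for i in range(n0):
        j = (i + 1) % n0
        weights[i][j] = 1.0
        s_out[i] += 1.0
        s_in[j] += 1.0

    for n in range(n0, n_nodes):
        # Strength-driven preferential attachment ("busy gets busier").
        # The +1 offset is an assumption so that new pages can be chosen.
        attract = [s_in[i] + 1.0 for i in range(n)]
        targets = set()
        while len(targets) < m:
            targets.add(rng.choices(range(n), weights=attract)[0])

        for i in sorted(targets):
            # Weight reinforcement: the new traffic n -> i adds a total
            # of delta to the traffic leaving i, shared among its
            # out-links in proportion to their current weights.
            total = s_out[i]
            for j, w in list(weights[i].items()):
                extra = delta * w / total
                weights[i][j] = w + extra
                s_in[j] += extra
            s_out[i] += delta

        for i in sorted(targets):
            # The m new hyperlinks n -> i, each with unit weight.
            weights[n][i] = 1.0
            s_out[n] += 1.0
            s_in[i] += 1.0

    return weights, s_in, s_out


if __name__ == "__main__":
    w, s_in, s_out = grow_web_graph(2000, m=2, delta=0.5, seed=1)
    n_links = sum(len(out) for out in w)
    print("average weight:", sum(s_in) / n_links)
    print("largest in-strength:", max(s_in))
```

Under these assumptions each new page adds m links of unit weight plus a total of m·delta of reinforced traffic, so the average weight approaches 1 + delta for large graphs, in line with the approximation <w> = 1 + δ quoted on the "Approximation" slide.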