Lecture 6-2
Modularity Maximization
Ding-Zhu Du
University of Texas at Dallas
[email protected]

Model-Based Detections
• Connection-based detection
• Modularity maximization
• Influence-based detection
• Overlapping community detection
• Hierarchical community detection

Model-Based Detection
Modularity maximization is the most popular one.

Outline
Modularity Function
Greedy
Spectral Method and MP
Hybrid Method

Modularity Function (Newman 2006)
Consider a graph $G = (V, E)$ with adjacency matrix $(a_{ij})$. Given a partition $C$ of $V$, define

$$Q = \frac{1}{2|E|} \sum_{i,j \in V} \left( a_{ij} - \frac{k_i k_j}{2|E|} \right) \delta_{C_i, C_j}$$

where $k_i$ is the degree of node $i$, $C_i$ is the community containing $i$, and $\delta_{C_i, C_j}$ is the Kronecker delta symbol. $Q$ is the fraction of edges that fall within communities minus the expected value of that fraction if edges were distributed at random.

Consider a graph $G = (V, E)$ with adjacency matrix $(a_{ij})$. Given a partition $C$ of $V$, define

$$Q = \sum_{i,j \in V} \left( \frac{a_{ij}}{2|E|} - \frac{k_i}{2|E|} \cdot \frac{k_j}{2|E|} \right) \delta_{C_i, C_j}$$

where $k_i$ is the degree of node $i$. If an edge is placed at random, it has endpoint $i$ with probability $\frac{k_i}{2|E|}$ and endpoint $j$ with probability $\frac{k_j}{2|E|}$. Hence it lies at $(i, j)$ with probability $\frac{k_i}{2|E|} \cdot \frac{k_j}{2|E|}$.

Modularity Function (Newman 2006)
Consider a graph $G = (V, E)$ with adjacency matrix $(a_{ij})$. Given a partition $C$ of $V$, define

$$Q = \frac{\sum_{i,j \in V} a_{ij} \, \delta_{C_i, C_j}}{2|E|} - \sum_{i,j \in V} \frac{k_i}{2|E|} \cdot \frac{k_j}{2|E|} \, \delta_{C_i, C_j}$$

where $k_i$, $C_i$, and $\delta_{C_i, C_j}$ are as defined above. The first term is the fraction of edges that fall within communities, and the second is the expected value of that fraction if edges were distributed at random.

Newman 2006
• M. E. J. Newman: Modularity and community structure in networks, Proceedings of the National Academy of Sciences, vol 103 no 23 (2006) pp. 8577-8582.

Modularity Function
Consider a graph $G = (V, E)$ with adjacency matrix $(a_{ij})$.
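Given a partition $C$ of $V$, define

$$
\begin{aligned}
Q &= \frac{1}{2|E|} \sum_{i,j \in V} \left( a_{ij} - \frac{k_i k_j}{2|E|} \right) \delta_{C_i, C_j} \\
  &= \frac{1}{2|E|} \sum_{i,j:\, C_i = C_j} \left( a_{ij} - \frac{k_i k_j}{2|E|} \right) \\
  &= \frac{1}{2|E|} \sum_{C_i} \left( 2|E^{in}_{C_i}| - \frac{\left( 2|E^{in}_{C_i}| + |E^{out}_{C_i}| \right)^2}{2|E|} \right) \\
  &= \sum_{C_i} \left[ \frac{|E^{in}_{C_i}|}{|E|} - \left( \frac{2|E^{in}_{C_i}| + |E^{out}_{C_i}|}{2|E|} \right)^2 \right]
\end{aligned}
$$

where the last two sums range over the communities of $C$, $E^{in}_{C_i}$ is the set of edges with both endpoints in $C_i$, and $E^{out}_{C_i}$ is the set of edges with exactly one endpoint in $C_i$. The third line uses the fact that the degrees of the nodes in $C_i$ sum to $2|E^{in}_{C_i}| + |E^{out}_{C_i}|$.

Modularity Function (Newman 2006)
Consider a graph $G = (V, E)$ with adjacency matrix $(a_{ij})$. Given a partition $(V_1, V_2, \ldots, V_k)$ of $V$, define

$$Q = \sum_{s=1}^{k} \left[ \frac{L(V_s, V_s)}{L(V, V)} - \left( \frac{L(V_s, V_s) + L(V_s, \bar{V}_s)}{L(V, V)} \right)^2 \right]$$

where $L(U, W) = \sum_{i \in U, j \in W} a_{ij}$ and $\bar{V}_s = V \setminus V_s$.

Modularity Function (digraph)
Consider a directed graph $G = (V, E)$ with adjacency matrix $(a_{ij})$. Given a partition $C$ of $V$, define

$$Q = \frac{1}{2|E|} \sum_{i,j \in V} \left( a_{ij} - \frac{k_i^{in} k_j^{out}}{2|E|} \right) \delta_{C_i, C_j}$$

where $k_i^{in}$ and $k_i^{out}$ are the in- and out-degree of node $i$ and $\delta_{C_i, C_j}$ is the Kronecker delta symbol. As in the undirected case, $Q$ is the fraction of edges that fall within communities minus the expected value of that fraction if edges were distributed at random.

Why Is It Called Modularity?
• Module = community in some complex networks
• The function describes the quality of modules.

Modularity Max is NP-hard
• U. Brandes, D. Delling, M. Gaertler, R. Gorke, M. Hoefer, Z. Nikoloski, and D. Wagner: On modularity clustering, IEEE Transactions on Knowledge and Data Engineering (TKDE), vol 20, no 2 (2008) pp. 172-188.

Outline
Modularity Function
Greedy
Spectral Method
Hybrid Method

Increment
Consider a graph $G = (V, E)$ with adjacency matrix $(a_{ij})$.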
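Given a partition $C$ of $V$, the modularity function is

$$Q = \sum_{C_i} \left[ \frac{|E^{in}_{C_i}|}{|E|} - \left( \frac{2|E^{in}_{C_i}| + |E^{out}_{C_i}|}{2|E|} \right)^2 \right]$$

When communities $C_i$ and $C_j$ are merged, the increment of $Q$ is

$$\Delta_{C_i C_j} Q = 2 \left( \frac{|E_{C_i, C_j}|}{2|E|} - \frac{|E_{C_i}| \, |E_{C_j}|}{4|E|^2} \right)$$

where $E_{C_i, C_j}$ is the set of edges between $C_i$ and $C_j$, and $|E_{C_i}| = 2|E^{in}_{C_i}| + |E^{out}_{C_i}|$ is the total degree of $C_i$. This follows from the formula for $Q$: the merge turns the $|E_{C_i, C_j}|$ edges between the two communities into internal edges and adds their total degrees.

Greedy Algorithm
input a graph $G = (V, E)$;
$U_1 \leftarrow \{\{v\} \mid v \in V\}$;
for $k = 1$ to $n - 1$ do
    choose $C_i$ and $C_j$ from $U_k$ to maximize $\Delta_{C_i C_j} Q$ and
    $U_{k+1} \leftarrow (U_k - \{C_i, C_j\}) \cup \{C_i \cup C_j\}$;
$k^* \leftarrow \arg\max_{1 \le k \le n} Q(U_k)$;
output $U_{k^*}$

Outline
Modularity Function
Greedy
Spectral Method and MP
Hybrid Method

Qualified Cut
Given a graph $G = (V, E)$, find a subset $S$ of $V$ to maximize $Q(S, \bar{S})$, the modularity of the partition $(S, \bar{S})$.

Community Partition
Apply the Qualified Cut to each part of the current partition until the value of $Q$ cannot be increased.

Quadratic Form
For a partition into two groups, let $s_i = 1$ if $i$ is in group 1 and $s_i = -1$ if $i$ is in group 2, so that $\delta_{C_i, C_j} = \frac{1}{2}(s_i s_j + 1)$. Then

$$
\begin{aligned}
Q &= \frac{1}{2|E|} \sum_{i,j \in V} \left( a_{ij} - \frac{k_i k_j}{2|E|} \right) \delta_{C_i, C_j} \\
  &= \frac{1}{4|E|} \sum_{i,j \in V} \left( a_{ij} - \frac{k_i k_j}{2|E|} \right) (s_i s_j + 1) \\
  &= \frac{1}{4|E|} \sum_{i,j \in V} \left( a_{ij} - \frac{k_i k_j}{2|E|} \right) s_i s_j \\
  &= \frac{1}{4|E|} s^T B s
\end{aligned}
$$

where $B = (B_{ij})$ with $B_{ij} = a_{ij} - \frac{k_i k_j}{2|E|}$. The constant term drops out because $\sum_{i,j} a_{ij} = 2|E| = \sum_{i,j} \frac{k_i k_j}{2|E|}$, so $\sum_{i,j} B_{ij} = 0$.

Spectral Method
If $s$ is relaxed to a real vector of fixed length, $Q = \frac{1}{4|E|} s^T B s$ achieves its maximum when $s$ is parallel to the eigenvector of $B$ for the largest eigenvalue. A $\pm 1$ vector is then obtained by rounding each $s_i$ to the sign of the corresponding eigenvector entry.

Linear Program

$$\max \ \frac{1}{2|E|} \sum_{i,j} B_{ij} (1 - x_{ij})$$

s.t. $x_{ik} \le x_{ij} + x_{jk}$ for all $i, j, k$
     $x_{ij} \in \{0, 1\}$ for all $i, j$

where $x_{ij} = 0$ if $i$ and $j$ are in the same community and $x_{ij} = 1$ if $i$ and $j$ are in different communities.

Vector Program

$$\max \ \frac{1}{4|E|} \sum_{i,j} B_{ij} (1 + s_i \cdot s_j)$$

s.t. $s_i \cdot s_i = 1$ for all $i$

With vector-valued $s_i$, this is a semidefinite program.

Outline
Modularity Function
Greedy
Spectral Method and MP
Hybrid Method

Resolution limit
• Misidentification: some derived communities do not satisfy the weak community definition, or even the weakest community definition.
• In other words, the communities obtained may have sparser connections within them than between them.

Hybrid Detection: a Possible Research Direction

Max Q s.t. condition (1)
• This may give an improvement.
• Is it possible to do?
• (1) can be written as linear constraints.
• Q can be written as a quadratic function.
• Thus, Max Q s.t.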
(1) can be formulated as a quadratic program, which can be transformed into a semidefinite program.

Linear Constraints
$x_{ik}$: indicates that node $v_i$ belongs to the $k$th community $V_k$
$z_{lk}$: indicates that edge $e_l$ belongs to the $k$th community $V_k$
For each edge $e_l = (v_i, v_j)$:
$z_{lk} \le x_{ik}$    ($e_l \in V_k \Rightarrow v_i \in V_k$)
$z_{lk} \le x_{jk}$    ($e_l \in V_k \Rightarrow v_j \in V_k$)
$x_{ik} + x_{jk} - 1 \le z_{lk}$    ($e_l \notin V_k \Rightarrow v_i \notin V_k$ or $v_j \notin V_k$)

Linear Constraints
$x_{ik}$: indicates that node $v_i$ belongs to the $k$th community $V_k$
$z_{lk}$: indicates that edge $e_l$ belongs to the $k$th community $V_k$
Community condition (1):

$$2 \sum_{l=1}^{m} z_{lk} > \sum_{j=1}^{n} \sum_{i=1}^{n} x_{ik} a_{ij} - 2 \sum_{l=1}^{m} z_{lk}$$

where $m$ = # of edges and $n$ = # of nodes. The left side is the total internal degree of $V_k$, and the right side is its total degree minus its internal degree.

Modularity Density
Modularity Density function (Li et al. 2008)
Consider a graph $G = (V, E)$ with adjacency matrix $(a_{ij})$. Given a partition $(V_1, V_2, \ldots, V_k)$ of $V$, define

$$D = \sum_{s=1}^{k} \frac{L(V_s, V_s) - L(V_s, \bar{V}_s)}{|V_s|}$$

where $L(U, W) = \sum_{i \in U, j \in W} a_{ij}$ and $\bar{V}_s = V \setminus V_s$.

Opt D s.t. condition (1)
• This may give an improvement.
• Is it possible to do?
• (1) can be written as linear constraints.
• D can be written as a fractional function.
• Thus, Max D s.t. (1) can be formulated as a geometric program.

Outline
Community Structure
Connection-Based Detection
Influence-Based Detection
Remarks

Remark 1
How should a method for finding communities be evaluated?

Clustering

Community Detection

Remark 2
How can hierarchical community detection be done?

Survey
• Introductory review: Communities in networks by M. A. Porter, J.-P. Onnela, and P. J. Mucha, Notices of the American Mathematical Society 56, 1082 (2009)
• Comprehensive review: Community detection in graphs by Santo Fortunato, Physics Reports 486, 75 (2010)

THANK YOU!
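
Supplementary sketch for the Increment and Greedy Algorithm slides. The Python below is one possible, deliberately naive rendering of the per-community modularity formula, the merge increment, and the greedy merging loop. It assumes a simple undirected graph (no self-loops or duplicate edges) given as a list of node pairs. The function names `modularity`, `merge_gain`, and `greedy_modularity` are hypothetical and chosen only for illustration, and every quantity is recomputed from scratch at each step, so this is not an efficient implementation.

```python
from itertools import combinations


def modularity(edges, communities):
    """Q = sum over communities of |E_in|/|E| - ((2|E_in| + |E_out|) / (2|E|))^2."""
    m = len(edges)
    label = {v: idx for idx, comm in enumerate(communities) for v in comm}
    e_in = [0] * len(communities)  # edges with both endpoints in the community
    deg = [0] * len(communities)   # total degree, i.e. 2|E_in| + |E_out|
    for u, v in edges:
        deg[label[u]] += 1
        deg[label[v]] += 1
        if label[u] == label[v]:
            e_in[label[u]] += 1
    return sum(e_in[c] / m - (deg[c] / (2 * m)) ** 2
               for c in range(len(communities)))


def merge_gain(edges, a, b):
    """Increment of Q when communities a and b are merged."""
    m = len(edges)
    # Number of edges running between a and b.
    e_ab = sum(1 for u, v in edges
               if (u in a and v in b) or (u in b and v in a))
    # Total degrees of a and b.
    d_a = sum((u in a) + (v in a) for u, v in edges)
    d_b = sum((u in b) + (v in b) for u, v in edges)
    return 2 * (e_ab / (2 * m) - d_a * d_b / (4 * m * m))


def greedy_modularity(nodes, edges):
    """Merge the best pair n-1 times; return the best partition seen and its Q."""
    partition = [frozenset([v]) for v in nodes]
    best, best_q = list(partition), modularity(edges, partition)
    while len(partition) > 1:
        a, b = max(combinations(partition, 2),
                   key=lambda pair: merge_gain(edges, *pair))
        partition = [c for c in partition if c not in (a, b)] + [a | b]
        q = modularity(edges, partition)
        if q > best_q:
            best, best_q = list(partition), q
    return best, best_q


# Illustrative input (an assumption): two triangles joined by one edge.
example_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
example_best, example_q = greedy_modularity(range(6), example_edges)
```

On the example graph the loop keeps the partition with the highest $Q$ among all merge levels, mirroring the final $\arg\max$ step of the pseudocode. A practical implementation would update the increments incrementally rather than rescanning the edge list.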