Discovering Leaders from Community Actions
Amit Goyal, Francesco Bonchi, Laks V.S. Lakshmanan
ICDM 2008

Outline
Introduction
Framework Definition
Influence propagation on the social network
Various notions of leaders
Algorithms
Experiments
Conclusion

Introduction

Word of Mouth and Viral Marketing
We are more influenced by our friends than by strangers.
68% of consumers consult friends and family before purchasing home electronics.

Viral Marketing
Also known as targeted advertising.
Initiates a chain reaction through the word-of-mouth effect.
Low investment, maximum gain.

Our Contributions
We formally define the notion of leaders and its various flavors.
We give efficient algorithms for extracting these leaders.

Framework Definition

Input Data (1)
A social network, i.e., an undirected graph G = (V, E) in which nodes are users and edges represent social ties.
Users declare their friends, e.g., on Facebook or Yahoo! Messenger.

Input Data (2)
An actions log sorted in chronological order, i.e., a relation Actions(User, Action, Time).
Example: "Jack" "joined Yoga community" at "time 5".

Action Propagation
The action is "joining the Yoga community".
Jack joined the Yoga community at time 5, Jill at time 8 (3 time units later), and Mary at time 1000.
Jack and Jill are friends. Jack and Mary are friends.
The action propagated from Jack to Jill and from Jack to Mary.

Propagation Graph
[Figure: propagation graph for joining the Yoga community. Jack joined at time 5, Jill at time 8, Joey at time 12, Ben at time 15, Mary at time 1000.]
Can we say Mary was influenced by Jack?
No.

User Influence Graph
When an action propagates from user u to user v, we may think of v as being influenced by u.
Influence should decay over time.
Size of the influence graph << size of the propagation graph (PG).
[Figure: the propagation graph (Jack at time 5, Jill at time 8, Joey at time 12, Ben at time 15, Mary at time 1000) next to the user influence graph for Jack (Jack, Jill, Joey, Ben).]

Leaders – first definition
Who should be a leader?
For an action, a leader should influence a sufficiently large number of users (>= ψ).
For an action, a leader should influence these users within a reasonable amount of time (<= π).
A user should act as a leader in a sufficiently large number of actions (>= σ).
[Figure: the propagation graph with elapsed times on its edges (3, 7, 4, 995, 7, 3). Jack joined at time 5, Jill at time 8, Joey at time 12, Ben at time 15, Mary at time 1000.]
If ψ = 2, π = 15, and σ = 1, then both Jack and Jill are leaders.

Tribe Leader
A leader may influence different users for different actions.
What if a leader leads a fixed set of users across different actions?
We call such leaders tribe leaders.
[Figure: Jack with action labels. A1, A2, and A3 are 3 different actions.]

Additional Constraint: Genuineness
A user may act as a leader while in practice always being a follower of other leaders.
We want to avoid such fake leaders.
Example: gen(Jill) = 1/3.
If gen(v) >= r, then v is defined to be a genuine leader.
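The first leader definition (thresholds ψ, π, σ) can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' algorithm: the friendship edges below are hypothetical (the slides only state that Jack is friends with Jill and Mary), and "influenced within π" is assumed to mean "reachable in the propagation graph, having acted at most π time units after u". The names influence_counts and leaders are made up for this example.

```python
from collections import defaultdict

def influence_counts(friends, log, pi):
    """Count, per (user, action), the users reachable in the propagation
    graph who acted at most `pi` time units after that user (assumed reading)."""
    by_action = defaultdict(dict)
    for user, action, t in log:
        by_action[action][user] = t
    counts = {}
    for action, times in by_action.items():
        # Propagation edge u -> v: u and v are friends and u acted before v.
        succ = defaultdict(list)
        for u, tu in times.items():
            for v, tv in times.items():
                if tu < tv and frozenset((u, v)) in friends:
                    succ[u].append(v)
        for u, tu in times.items():
            seen, stack = set(), [u]
            while stack:
                x = stack.pop()
                for y in succ[x]:
                    # Times only grow along edges, so pruning here is safe.
                    if y not in seen and times[y] - tu <= pi:
                        seen.add(y)
                        stack.append(y)
            counts[(u, action)] = len(seen)
    return counts

def leaders(counts, psi, sigma):
    """Users who influence >= psi users in at least sigma actions."""
    qualifying = defaultdict(int)
    for (user, _action), c in counts.items():
        if c >= psi:
            qualifying[user] += 1
    return {u for u, k in qualifying.items() if k >= sigma}

# Hypothetical social ties for the Yoga example.
friends = {frozenset(p) for p in [
    ("Jack", "Jill"), ("Jack", "Mary"), ("Jack", "Joey"),
    ("Jill", "Joey"), ("Jill", "Ben"), ("Joey", "Ben"),
]}
log = [("Jack", "yoga", 5), ("Jill", "yoga", 8), ("Joey", "yoga", 12),
       ("Ben", "yoga", 15), ("Mary", "yoga", 1000)]

counts = influence_counts(friends, log, pi=15)
print(sorted(leaders(counts, psi=2, sigma=1)))  # ['Jack', 'Jill']
```

With these assumed edges, Jack influences Jill, Joey, and Ben, while Mary is excluded because she joined 995 time units later. Jill influences Joey and Ben, so both Jack and Jill pass ψ = 2.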
[Figure: Jack, Tom, and Jill with action labels A1 and A2. A1, A2, and A3 are 3 different actions.]

Algorithms

Algorithms: Overview
Assumptions:
The social graph is huge: millions of nodes.
The actions log is huge: millions of tuples.
For an action, size of the user influence graph << size of the propagation graph, for all users.
Our algorithms extract the patterns (leaders and tribe leaders) in no more than one scan of the actions log table.

Algorithms: Overview (continued)
Scan the actions log table with a window of size π, backward in time, i.e., starting from the most recent timestamp (the bottom of the table if tuples are ordered by time).
Efficiently compute the influence matrix IM, a Users x Actions matrix in which IMπ(u, a) is the number of users influenced by u w.r.t. action a within time π.
Compute leaders from IM.
Example: IM10(Jack, "joining yoga community") = 3.
[Figure: Jack joined at time 5, Jill at time 8, Joey at time 12, Ben at time 15.]

Computing Influence Matrix (1)
We use a bit vector to track which users are influenced by a given user. It is updated incrementally.
A locking mechanism uses another bit vector: 0 => free bit; 1 => occupied bit.
The node-to-bit-index mapping is stored in a queue.
Bits must be dynamically allocated.
[Figure: queue with entries (V,2) and (W,1) at its head.]
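The bookkeeping described on this slide (influence bit vectors, a lock bit vector, and a queue holding the node-to-bit mapping) can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: the class name InfluenceWindow, the lowest-free-bit allocation policy, and the follower lists in the usage are assumptions, and eviction is triggered by hand rather than by timestamps.

```python
from collections import deque

class InfluenceWindow:
    """Toy model of the sliding window over one action's propagation graph."""

    def __init__(self):
        self.lock = 0          # lock bit vector: bit set => index occupied
        self.queue = deque()   # (node, bit index), oldest window entry first
        self.infvec = {}       # node -> influence bit vector (includes own bit)

    def _allocate(self):
        # Assumed policy: take the lowest free bit and mark it occupied.
        bit = 0
        while (self.lock >> bit) & 1:
            bit += 1
        self.lock |= 1 << bit
        return bit

    def add(self, node, followers_in_window):
        """Lock a bit for `node`, build its vector by propagation (OR of its
        direct followers' vectors), and return its number of followers."""
        bit = self._allocate()
        vec = 1 << bit
        for f in followers_in_window:
            vec |= self.infvec[f]
        self.infvec[node] = vec
        self.queue.append((node, bit))
        return bin(vec).count("1") - 1  # exclude the node's own bit

    def evict(self):
        """Slide the window: drop the head node and free its bit everywhere,
        so the index can be reused without leaving stale bits behind."""
        node, bit = self.queue.popleft()
        mask = ~(1 << bit)
        self.lock &= mask
        del self.infvec[node]
        for n in self.infvec:
            self.infvec[n] &= mask
        return node

# Scanning backward in time, later actors enter the window first.
w = InfluenceWindow()
w.add("V", [])
w.add("W", ["V"])
w.add("T", ["W"])
w.add("S", ["W"])
w.add("R", ["T", "S"])
w.evict()                  # V leaves the window and its bit is released
print(w.add("P", ["R"]))   # 4
```

In this run P reuses the bit index released by V and ends up with four followers (R, S, T, W). The concrete bit indices depend on the allocation policy, so they need not match the indices drawn in the slide figures.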
Node  InfVec
R     01010111
S     01000110
T     00010110
W     00000110
V     00000100
Queue (continued): (T,4) (S,6) (R,0)
Lock bit vector: 01010111
[Figure: time window on the propagation graph covering R, S, T, W, V.]

Computing Influence Matrix (2)
Slide up the current window: delete node V.
Delete the entry from the queue.
Update the lock bit vector.
Update the influence vectors.
Queue before the deletion, from the head: (V,2) (W,1) (T,4) (S,6) (R,0)
Node  InfVec before  InfVec after
R     01010111       01010011
S     01000110       01000010
T     00010110       00010010
W     00000110       00000010
V     00000100       (deleted)
Lock bit vector: 01010111 before, 01010011 after
[Figure: time window on the propagation graph covering R, S, T, W, V.]

Computing Influence Matrix (3)
New node P is added.
Issue a lock and add an entry to the queue.
Compute its influence vector by propagation.
Number of followers of P = 4, so IM(P, a) = 4.
Queue, from the head: (W,1) (T,4) (S,6) (R,0) (P,2)
Node  InfVec
P     01010111
R     01010011
S     01000010
T     00010010
W     00000010
Lock bit vector: 01010011 before, 01010111 after
[Figure: time window on the propagation graph covering P, R, S, T, W.]

Mining Tribe Leaders
The influence matrix is not enough.
We use an influence cube: Users x Actions x Users.
ICπ(u, a, v) = 1 when user v is influenced by user u for action a within time π.
We do not explicitly compute the whole cube, because it is sparse.
The problem is the same as discovering the existence of frequent itemsets of size larger than a given threshold.

Experiments
Parameters: π is the threshold on time, σ is the threshold on the number of actions, ψ is the threshold on the number of influenced users.
[Figure: Leaders vs. tribe leaders]
[Figure: Number of leaders found]
[Figure: Number of leaders found (continued)]
[Figure: Run-time]
Genuineness: an almost binary concept!

Conclusions
We proposed a framework based on frequent pattern mining for discovering leaders in social networks.
We formally define the problem of extracting leaders from a social graph and an actions log.
Various notions of leader and tribe leader, and their genuine variants.
Efficient algorithms for extracting leaders of various flavors.
Just one pass over the actions log table.