Understanding (Mis)information Spreading for Improving Corporate Network Trustworthiness

Candidate: Mara Sorella
Advisor: Prof. Roberto Baldoni
Supervisor: Dott.ssa Silvia Bonomi
Academic Year 2012/2013

Outline
- Misinformation in SN
- Spreading models
- Problems
- Moving to Corporate networks
- A model for Corporate Social Networks
- Problem formulation
- Evaluation: Case Studies
- Enron
- DIAG
- Future works

Introduction
Social networks: a medium for the spread of information
- Opinions, ideas, information, innovation
- Direct marketing exploits word-of-mouth effects to significantly increase profits

Spreading of information in SN
Two basic classes of graph-based diffusion models: Threshold and Cascade
General operational view:
- Directed graph G = (V, E); users = nodes
- Edges (u, v) can be weighted to represent the influence of node u on v
- Nodes start either active or inactive
- An active node may trigger the activation of neighboring nodes
- Monotonicity assumption: active nodes never deactivate

Linear Threshold
- Each node v has a random threshold $\theta_v$
- A node is influenced by each neighbor according to a weight, such that the weights of the edges entering a node sum to at most 1
- Activation condition: a node becomes active when the total weight of its active neighbors reaches its threshold
[Figure: three-node example with nodes u, w, v, edge weights 0.4, 0.2 and 0.5, and thresholds $\theta_w = 0.3$, $\theta_v = 0.6$.]

Independent Cascade
- When a node becomes active, it has a single chance of activating each currently inactive neighbor
- The activation attempt succeeds with the probability assigned to the corresponding edge
[Figure: nodes u, w, v with edge probabilities 0.2 and 0.5.]

Example (ICM)
[Figure: step-by-step run of the Independent Cascade model on a small graph with nodes u, v, w, x and edge probabilities between 0.1 and 0.6. Legend: inactive node, active node, newly activated node, successful attempt, unsuccessful attempt.]
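The two diffusion models described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the thesis: the function names `linear_threshold` and `independent_cascade`, the dictionary-based graph encoding, and the way the slide's weights are assigned to edges are all assumptions made for the example. In the Linear Threshold model the thresholds are drawn at random; here they are passed in explicitly so that the run is reproducible.

```python
import random


def linear_threshold(in_weights, thresholds, seeds):
    """Linear Threshold: return the final set of active nodes.

    in_weights maps each node v to {in-neighbor u: weight of edge (u, v)}.
    thresholds maps each node v to its threshold.
    """
    active = set(seeds)
    changed = True
    while changed:                      # repeat until no node can be activated
        changed = False
        for v, weights in in_weights.items():
            if v in active:
                continue                # monotonicity: active nodes stay active
            total = sum(w for u, w in weights.items() if u in active)
            if total >= thresholds[v]:  # activation condition
                active.add(v)
                changed = True
    return active


def independent_cascade(graph, seeds, rng):
    """Independent Cascade: return the final set of active nodes.

    graph maps each node u to a list of (neighbor v, probability) pairs.
    """
    active = set(seeds)
    frontier = list(seeds)              # nodes activated in the previous step
    while frontier:
        newly_active = []
        for u in frontier:
            for v, p in graph.get(u, []):
                # u gets a single chance to activate each inactive neighbor
                if v not in active and rng.random() < p:
                    active.add(v)
                    newly_active.append(v)
        frontier = newly_active         # stop when nothing new was activated
    return active


# Linear Threshold with the slide's numbers; the edge assignment is assumed.
lt_weights = {"w": {"u": 0.4}, "v": {"u": 0.2, "w": 0.5}}
lt_thresholds = {"w": 0.3, "v": 0.6}
lt_result = linear_threshold(lt_weights, lt_thresholds, {"u"})
print(sorted(lt_result))

# Independent Cascade on an illustrative graph (topology assumed).
toy_graph = {
    "x": [("u", 0.6), ("w", 0.2)],
    "u": [("w", 0.3), ("v", 0.5)],
    "w": [("v", 0.2)],
    "v": [],
}
ic_result = independent_cascade(toy_graph, {"x"}, random.Random(0))
print(sorted(ic_result))
```

A single Independent Cascade run is random, so the influence of a seed set has to be estimated by averaging over many runs.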
Problems in SN
Influence of a node set S: f(S) is the expected number of active nodes at the end of the process, if S is the initial active set.

Influence maximization
- A set S of nodes is selected for initial activation
- Problem: given a parameter k (budget), find a k-node set S that maximizes f(S)

Misinformation containment
- Information (set L) and misinformation (set A) are competing
- Problem: given a parameter k (budget), find a k-node set L that maximizes f(L)

[24] Kempe et al. Maximizing the spread of influence through a SN. [KDD '03]
[11] Budak et al. Limiting the spread of misinformation in social networks. [WWW '11]
[9] Bharathi et al. Competitive influence maximization in SN. [WINE '07]
[31] Nguyen et al. Containment of misinformation spread in OSN. [WebSci '12]

From SN to Corporate Networking
- Key point: a hierarchical interpretation exists over the set of entities forming the system
- The organizational chart represents the hierarchical organization of a company
- Alongside it, social networks are commonly used in corporate networks:
  - Social relationships within the corporation
  - Technological means

Corporate Social Network Tools
Tools for improving the efficiency of a company:
- Internal SN: true internal social networks for expertise localization, e.g. IBM SmallBlue
- Internal messaging systems, e.g. email, internal chat service

Detecting influential nodes
- Social connections can create potential vulnerabilities: employees at the lower levels of the organizational chart may become influential thanks to their social connections
- Unexpected influence can be dangerous if the employee behaves maliciously (a potential insider), reducing the trustworthiness of the overall organization
Therefore, a joint analysis must be performed of:
- the hierarchical relationships imposed by the organizational structure
- the social relationships observed through the presence of a social network among the employees
Main purpose: identifying the global scope of the influence of every node of the network. Based on this, appropriate countermeasures can be taken to prevent potential attacks.

Towards a CN Model
Hierarchical Network + Social Network = Corporate Social Network
Components:
- Network graph (topology)
- Influence mapping function
- Information diffusion model

Hierarchical Network
[Figure: employees u and v linked by hierarchical relationships. Legend: employee, hierarchical relationships.]
- Ed: "direct edges" (going down)
- Er: "reverse edges" (going up)

Social Network
[Figure: employees u and v connected by social edges Es.]

Social Influence Mapping Function
- No specific constraint over values/relationships
- Can be derived from the specific social network considered
- Can also be a superimposition of several social means

Corporate Social Network Model
[Figure: corporate social graph on nodes u and v combining the edge sets Ed, Er and Es, with the merging rules for the influence function.]

f-Influential nodes identification
- Influence function of a node v: the expected number of nodes that will be influenced by v at the end of the spreading process
- Problem (f-influential Nodes Identification): this is done in order to find the f-influential "weak" nodes

Experiments
In order to discover the f-influential nodes, we study the spread of information with 10000 Monte Carlo simulations from every single node in three different settings, corresponding to the graphs H, S and HS.
- P is the probability associated with the edges (u, v) representing the "u is a member of v's staff" relationship
- The same experiment is repeated for two values of P: P = 0 and P = 0.5
  - P = 0: supervisors do not listen to the people in their staff
  - P = 0.5: a supervisor may either accept or reject a piece of information coming from a member of his/her staff
- The value of f in the experiments is set to 0.5

Case Study: Enron Corp.
H graph
- Organizational chart recovered from official documents released to the public
- Tree-shaped graph of height 8 (labeled via BFS, 60% leaves)
S graph
- The company's social network, represented by email exchanges
- Influence over edges: associated with the number of emails sent by u to v (threshold values)

Results, P = 0
[Figure: number of nodes reached by each of the 151 Enron employees in the three graphs: Corporate Social Graph (HS), Organizational Chart (H), Social Network Graph (S). x-axis: nodes ordered by rank (0 to 150); y-axis: number of reached nodes (0 to 150); f = 0.5; "weak" nodes highlighted.]
Employees are ordered by rank of appearance (BFS) in the organizational chart (0 = CEO, 150 = bottom-level employee).

Results, P = 0.5
[Figure: number of nodes reached by each of the 151 Enron employees in the three graphs (HS, H, S) for P = 0.5, with the same axes and ordering as for P = 0.]
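The experimental procedure described above (repeated Monte Carlo runs from every single node, then a cut at f) can be sketched as follows. This is only an illustration under stated assumptions: the slides do not preserve the exact definition of an f-influential node, so the sketch assumes a node qualifies when its estimated influence, counting the node itself, is at least a fraction f of the network. The function names, the toy hierarchy, and the probability 1.0 on downward edges are hypothetical, and Independent Cascade is used as the diffusion process.

```python
import random


def cascade_size(graph, source, rng):
    """One Independent Cascade run from a single source; returns the
    number of active nodes at the end (the source included)."""
    active = {source}
    frontier = [source]
    while frontier:
        newly_active = []
        for u in frontier:
            for v, p in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    newly_active.append(v)
        frontier = newly_active
    return len(active)


def estimate_influence(graph, source, runs, rng):
    """Monte Carlo estimate of the expected number of reached nodes."""
    return sum(cascade_size(graph, source, rng) for _ in range(runs)) / runs


def f_influential_nodes(graph, f, runs=10000, seed=0):
    """Return {node: estimated influence} for nodes whose influence is at
    least f * |V| (assumed definition of an f-influential node)."""
    rng = random.Random(seed)
    nodes = set(graph) | {v for edges in graph.values() for v, _ in edges}
    cutoff = f * len(nodes)
    influence = {n: estimate_influence(graph, n, runs, rng) for n in sorted(nodes)}
    return {n: s for n, s in influence.items() if s >= cutoff}


def toy_hierarchy(p_up):
    """Hypothetical three-node chart: boss supervises a and b.
    Downward edges use probability 1.0 (assumption); upward edges use P."""
    return {
        "boss": [("a", 1.0), ("b", 1.0)],
        "a": [("boss", p_up)],
        "b": [("boss", p_up)],
    }


print(f_influential_nodes(toy_hierarchy(0.0), f=0.5, runs=1000))
print(f_influential_nodes(toy_hierarchy(0.5), f=0.5, runs=1000))
```

Running the same function on H, S and HS separately and comparing the per-node estimates is what exposes nodes that are weak in the hierarchy but influential once social edges are added.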
Case Study: DIAG
Dataset: H graph
- Tree-shaped graph (DAG) derived from publicly available documents
- Depth 4
- Roles: Director, Full Professor, Associate, Researcher, Expert Engineer, PhD, Technical/Administrative staff, Other

DIAG: S graph
- Email exchanges among the department members
- 2 months of traffic, obtained from the network administrator, who was provided with an obfuscation tool
- Regular expression matching on the logfiles of the local Postfix Mail Transfer Agent:

Nov 10 10:42:31 mail postfix/qmgr[6885]: BB3741F69B: from=<[email protected]>, size=1131, nrcpt=1 (queue active)
Nov 10 10:42:32 mail postfix/local[7078]: BB3741F69B: to=<[email protected]>, relay=local, delay=2, status=sent (delivered to command: /usr/bin/procmail)

- Only emails among department members were considered
- No information on email subject/contents
- Obfuscated HDIAG graph

Anonymization Flow
[Figure: anonymization pipeline. The labeled organizational graph (nodes a, b, c, d, e) goes through random-salt node hashing and shuffling, producing the anonymized HDIAG graph and a list of matches (a : dcc7e59, b : d10ca76, ...). The Postfix logs go through substitution using the list of matches, producing anonymized logs. Anonymized Postfix log parsing then yields the anonymized SDIAG graph. Together these give the anonymized HSDIAG graph.]

Results (Role Clustering)
[Figure: number of reached nodes (0 to 300) per role (Head, Full Prof., Associate, Researcher, Exp. Eng., PhD, Staff, Other) in the Organizational Chart (H), Social Network Graph (S) and Corporate Social Graph (HS).]
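The construction of the DIAG S graph (regular expression matching on Postfix logs plus salted hashing of identities) can be illustrated with a short Python sketch. It is not the obfuscation tool used in the thesis. The regular expression, the function names, the use of SHA-256 truncated to 7 hexadecimal characters (chosen only to resemble the identifiers shown in the slides), and the example.org addresses are all assumptions.

```python
import hashlib
import re
import secrets
from collections import Counter

# Matches the queue id and the from=/to= address of a Postfix log line.
LINE_RE = re.compile(
    r"postfix/\w+\[\d+\]: (?P<qid>[0-9A-F]+): (?P<dir>from|to)=<(?P<addr>[^>]*)>"
)


def anonymize(address, salt):
    """Salted hash of an address, truncated to 7 hex characters (assumed scheme)."""
    digest = hashlib.sha256(salt + address.lower().encode("utf-8")).hexdigest()
    return digest[:7]


def build_email_edges(lines, salt):
    """Return a Counter mapping (sender id, recipient id) to the number of emails.

    Sender and recipient lines are joined through the Postfix queue id.
    """
    senders = {}
    edges = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue
        qid = match.group("qid")
        node = anonymize(match.group("addr"), salt)
        if match.group("dir") == "from":
            senders[qid] = node
        elif qid in senders:
            edges[(senders[qid], node)] += 1
    return edges


# Hypothetical log lines in the format shown above.
sample_log = [
    "Nov 10 10:42:31 mail postfix/qmgr[6885]: BB3741F69B: "
    "from=<alice@example.org>, size=1131, nrcpt=1 (queue active)",
    "Nov 10 10:42:32 mail postfix/local[7078]: BB3741F69B: "
    "to=<bob@example.org>, relay=local, delay=2, status=sent",
]
salt = secrets.token_bytes(16)  # random salt, discarded after anonymization
print(build_email_edges(sample_log, salt))
```

The per-edge email counts are what the influence values of the S graph would be derived from. Restricting the edges to department members would need an extra filter on the addresses before hashing.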
Average Role Spreading

Average role spreading in H (% reached nodes)
Role         P = 0     P = 0.5
Director     100.0%    100.0%
Full Prof.   3.0%      66.0%
Associate    0.9%      51.0%
Researcher   0.8%      49.7%
Exp. Eng.    0.4%      44.9%
PhD          0.4%      44.6%
Staff        0.5%      47.8%
Other        0.4%      46.0%

Average role spreading in S (% reached nodes)
Role         Reached
Director     33.6%
Full Prof.   16.0%
Associate    10.0%
Researcher   17.0%
Exp. Eng.    8.6%
PhD          6.3%
Staff        16.0%
Other        3.6%

Average role spreading in HS (% reached nodes)
Role         P = 0     P = 0.5
Director     100.0%    100.0%
Full Prof.   59.0%     82.7%
Associate    27.0%     70.0%
Researcher   48.9%     77.0%
Exp. Eng.    24.0%     66.2%
PhD          17.7%     59.0%
Staff        46.0%     76.0%
Other        9.0%      56.0%

Another perspective
- Community detection performed on the social graph
- Specific purpose of email exchanges
- S is assumed to contain the underlying research/workgroup structure
- 15 clusters identified
[Figure: social graph partitioned into clusters C0 to C14.]

Cluster Composition
[Figure: role composition (Director, Full Prof., Associate, Researcher, Exp. Eng., Staff, PhD, Other) of each of the 15 clusters, with per-cluster percentages.]

Results (Clusters), P = 0
[Figure: number of reached nodes (0 to 300) per cluster (C0 to C14) in the Organizational Chart (H), Social Network Graph (S) and Corporate Social Graph (HS).]

Comparative Considerations
- Aims/purposes: Enron is a profit-making company; DIAG is an academic department
- Hierarchical structure: Enron is hierarchized and deep; DIAG is flat
- Social activity: low and business-oriented in Enron; high (4x) in DIAG, with a higher level of collaboration and teaching activities
- Overall spreading: low and position-related in Enron, with few unexpected peaks; high and position-independent in DIAG, with many unexpected peaks

Future Works
Other problems related to enforcing trustworthiness with a human in the loop:
- Explicit constraints on subparts of the organization that have conflicts of interest among them
  - e.g. banking/financial institutions and supervisory agencies
- Development of online and offline workforce reorganization algorithms
  - minimize exchanges between blocks that must be kept isolated
  - expertise constraints
- Placing a new employee
  - analyze existing social ties
- Misinformation cascade post-mortem analysis
  - after the occurrence of an information leak