INFORMATIVE VALUE OF NEGATIVE LINKS FOR GRAPH PARTITIONING
With an application to European Parliament Votes
Modèles et analyse des réseaux : approches mathématiques et informatiques (MARAMI)
14 October 2015
Israel Mendonça, Rosa Figueiredo, Vincent Labatut & Philippe Michelon
LIA EA 4128 - Université d’Avignon (Laboratoire Informatique d’Avignon)
SIGNED NETWORKS
# G = (V, E, s), with s : E → {+, −}
# Modeling of systems containing antithetical relationships
# Ex.: evolution of pre-WWI alliances between countries [AKR06]
[Figure: six signed networks over the countries GB, F, R, AH, G and I, with links labeled + or −: 3 Emperor’s league 1872−81, Triple Alliance 1882, German−Russian Lapse 1890, French−Russian Alliance 1891−94, Entente Cordiale 1904, British−Russian Alliance 1907]
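As a minimal sketch of the definition G = (V, E, s), a signed graph can be stored as a mapping from unordered node pairs to a sign. The node names and signs below are purely hypothetical and are not the alliance data of the figure; the helper names are illustrative as well.

```python
# Hypothetical toy signed graph: keys are unordered node pairs (edges),
# values are the sign s(e), encoded as +1 or -1.
signs = {
    frozenset(("a", "b")): +1,
    frozenset(("b", "c")): -1,
    frozenset(("a", "c")): -1,
    frozenset(("c", "d")): +1,
}

def sign(u, v):
    """Return s(u, v) as +1 or -1, or None if (u, v) is not an edge."""
    return signs.get(frozenset((u, v)))

# V is the set of all endpoints; E is split by sign.
nodes = {n for edge in signs for n in edge}
positive_edges = [e for e, s in signs.items() if s > 0]
negative_edges = [e for e, s in signs.items() if s < 0]

print(sorted(nodes), len(positive_edges), len(negative_edges))
```

Using `frozenset` keys makes the lookup independent of the order in which the two endpoints are given, which matches an undirected graph.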
GRAPH PARTITIONING
# Correlation Clustering problem: minimize the imbalance
# Community Detection problem: identify dense clusters
[Figure: example signed graph (nodes numbered 1 to 13, links labeled + or −) illustrating the partitioning]
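The imbalance minimized by Correlation Clustering can be sketched as follows, assuming its usual definition: the total weight of negative links inside clusters plus the total weight of positive links between clusters. The function names, the edge-list format and the normalization by total absolute weight (to get a percentage) are assumptions made for illustration, not taken from the slides.

```python
def imbalance(edges, membership):
    """Sum the weights of misplaced links.

    edges: iterable of (u, v, weight); the sign of weight is the link sign.
    membership: dict mapping each node to its cluster id.
    """
    total = 0.0
    for u, v, w in edges:
        same_cluster = membership[u] == membership[v]
        if w < 0 and same_cluster:
            total += -w          # negative link inside a cluster
        elif w > 0 and not same_cluster:
            total += w           # positive link between clusters
    return total

def imbalance_percent(edges, membership):
    """Imbalance as a percentage of the total absolute weight (assumed normalization)."""
    total_weight = sum(abs(w) for _, _, w in edges)
    if total_weight == 0:
        return 0.0
    return 100.0 * imbalance(edges, membership) / total_weight

# Hypothetical toy graph with four unit-weight links.
edges = [(1, 2, 1.0), (2, 3, 1.0), (1, 3, -1.0), (3, 4, 1.0)]
two_clusters = {1: 0, 2: 0, 3: 1, 4: 1}
print(imbalance(edges, two_clusters), imbalance_percent(edges, two_clusters))
```

In this toy case only the positive link (2, 3) crosses the two clusters, so one link out of four is misplaced.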
RELEVANCE OF NEGATIVE LINKS
# Negative links are costly → relevance for graph partitioning?
# Work by Esmailian et al. [EAJ14]
◦ Community detection on positive links only
◦ Study the community-wise location of negative links
◦ → Most are between communities or non-significant
[Figure: example signed graph (nodes numbered 1 to 12, links labeled + or −)]
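The second step of the approach above (locating negative links relative to communities found on positive links only) can be sketched as a simple count. This is an illustrative sketch, not the procedure of [EAJ14]: the community membership is assumed to be already computed, and the names and toy data are hypothetical.

```python
def negative_link_location(edges, membership):
    """Count negative links inside communities and between communities.

    edges: iterable of (u, v, weight); negative weight means negative link.
    membership: dict node -> community id, assumed to come from a
    community detection run on the positive links only.
    """
    inside = 0
    between = 0
    for u, v, w in edges:
        if w < 0:
            if membership[u] == membership[v]:
                inside += 1
            else:
                between += 1
    return {"inside": inside, "between": between}

# Hypothetical toy data: two communities, {1, 2, 3} and {4, 5}.
edges = [(1, 2, 1), (2, 3, 1), (4, 5, 1), (1, 4, -1), (3, 5, -1), (1, 3, -1)]
membership = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1}
print(negative_link_location(edges, membership))
```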
# Limitations:
◦ Only 2 datasets, both social networking services
◦ Imbalance assessed only locally
# Proposed method:
◦ Consider a different dataset, modeling a different type of relationship
◦ Compare community detection and correlation clustering algorithms
DATA EXTRACTION
# Raw data:
◦ Nature: Voting activity at the European Parliament
◦ Source: VoteWatch Europe
◦ Period: 7th term (June 2009–June 2014)
◦ Size: 840 MEPs, 1426 documents, 21 topics
# Voting Agreement Index:
◦ Compares two MEPs
◦ Ranges from −1 to +1
◦ Document-wise agreement averaged over all documents
◦ Agreement: +1 (For vs. For, Against vs. Against)
◦ Disagreement: −1 (For vs. Against)
◦ Undetermined: 0 (Abstain/Absent vs. ∗)
# Networks:
◦ Nodes: Members of the European Parliament (MEPs)
◦ Weighted: voting agreement index values
◦ Modes: 264 (time × topics)
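The Voting Agreement Index described above can be sketched for one pair of MEPs as follows. The vote labels ("For", "Against", "Abstain", "Absent") and the function name are assumptions made for illustration; the scoring follows the rules listed on the slide (+1 for agreement, −1 for disagreement, 0 when either vote is an abstention or an absence), averaged over all documents.

```python
EXPRESSED = ("For", "Against")

def agreement_index(votes_a, votes_b):
    """Average document-wise agreement between two MEPs, in [-1, +1].

    votes_a, votes_b: equal-length sequences of vote labels, one per document.
    """
    if len(votes_a) != len(votes_b):
        raise ValueError("both MEPs must have one vote entry per document")
    if not votes_a:
        return 0.0
    total = 0.0
    for a, b in zip(votes_a, votes_b):
        if a not in EXPRESSED or b not in EXPRESSED:
            total += 0.0     # undetermined: abstention or absence on either side
        elif a == b:
            total += 1.0     # agreement: For/For or Against/Against
        else:
            total -= 1.0     # disagreement: For/Against
    return total / len(votes_a)

# Hypothetical votes on four documents.
mep_1 = ["For", "Against", "Abstain", "For"]
mep_2 = ["For", "For", "For", "Against"]
print(agreement_index(mep_1, mep_2))
```

Computing this value for every pair of MEPs yields the weights of the signed network; pairs with a positive index become positive links and pairs with a negative index become negative links.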
PARTITIONING ALGORITHMS
# Correlation clustering (G)
◦ Parallel Iterated Local Search (pILS) [LDFF15]
# Community detection (G+ and G−)
◦ InfoMap [RB08]
◦ EdgeBetweenness [NG04]
◦ WalkTrap [PL05]
◦ FastGreedy [CNM04]
[Figure: three panels labeled G, G+ and G−]
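Since the community detection algorithms listed above take unsigned graphs as input, the signed graph G has to be split by sign first. The sketch below is one simple way to do it and rests on assumptions: the slides do not detail how G− is built, so keeping the negative links with their absolute weights is a guess, and the function name is hypothetical.

```python
def split_by_sign(edges):
    """Split a weighted signed edge list into positive and negative parts.

    edges: iterable of (u, v, weight).
    Returns (positive, negative); negative weights are returned as absolute
    values (an assumption), and zero-weight pairs are dropped from both parts.
    """
    positive = [(u, v, w) for u, v, w in edges if w > 0]
    negative = [(u, v, -w) for u, v, w in edges if w < 0]
    return positive, negative

# Hypothetical weighted signed graph.
edges = [("a", "b", 0.8), ("b", "c", -0.5), ("a", "c", 0.0), ("c", "d", 0.25)]
g_plus, g_minus = split_by_sign(edges)
print(g_plus)
print(g_minus)
```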
EXTRACTED NETWORKS
[Figure: Agreement distribution for all years and topics; x-axis: Agreement (−1.0 to 1.0), y-axis: Frequency (0 to 15000)]
# Same observations for all topics/durations
# Positive side: bimodal distribution
◦ Left peak: certain MEPs are frequently absent
◦ Right peak: most MEPs often vote similarly
◦ Clear majority, on average
# Negative side: less extreme values
PARTITION COMPARISON
[Figure: Foreign & Security Policy; Imbalance (%) for 2009, 2010, 2011, 2012, 2013 and Term. Legend: Parallel ILS, Positive Infomap, Comp. Neg. Infomap, Positive EdgeBetw., Comp. Neg. EdgeBetw., Positive FastGreedy, Comp. Neg. FastGreedy, Positive WalkTrap, Comp. Neg. WalkTrap]
# Partition comparison: near-zero NMI
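For reference, the normalized mutual information (NMI) between two partitions can be sketched as below. Several normalizations exist; this sketch assumes the variant 2·I(A;B) / (H(A) + H(B)), which may differ from the one used to produce the results. Partitions are given as lists of cluster labels, one per node, and the function names are illustrative.

```python
from collections import Counter
from math import log

def entropy(labels):
    """Shannon entropy of a partition given as a list of cluster labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())

def nmi(labels_a, labels_b):
    """NMI = 2 * I(A;B) / (H(A) + H(B)); 1 = identical, 0 = independent."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must cover the same nodes")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mutual = sum(
        (c / n) * log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    h_sum = entropy(labels_a) + entropy(labels_b)
    if h_sum == 0:
        return 1.0   # both partitions consist of a single cluster
    return 2.0 * mutual / h_sum

# Same grouping with different label names, then two unrelated groupings.
print(nmi([0, 0, 1, 1], [1, 1, 0, 0]))
print(nmi([0, 0, 1, 1], [0, 1, 0, 1]))
```

A near-zero value, as reported on the slide, means that knowing a node's cluster in one partition says almost nothing about its cluster in the other.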
CONCLUSION
# Considering negative links on our dataset leads to:
◦ Lower imbalance (at least 3 times better)
◦ Only InfoMap outputs CC-like results (imbalance)
◦ Different partitions (fewer clusters)
# Contradiction with Esmailian et al.’s conclusions
◦ (but not their results)
# Perspectives:
◦ Consider more data, different types of networks (collection)
◦ Exhaustive exploration of vote-based extraction methods
◦ Political interpretation of the VoteWatch results
ONLINE RESOURCES
# Data available on Figshare:
◦ http://dx.doi.org/10.6084/m9.figshare.1545599
# R source code available on GitHub:
◦ https://github.com/CompNet/NetVotes
REFERENCES
[AKR06] T. Antal, P. L. Krapivsky, and S. Redner. Social balance on networks: The dynamics of friendship and enmity. Physica D, 224(1-2):130–136, 2006.
[CNM04] A. Clauset, M. E. J. Newman, and C. Moore. Finding community structure in very large networks. Physical Review E, 70(6):066111, 2004.
[EAJ14] P. Esmailian, S. E. Abtahi, and M. Jalili. Mesoscopic analysis of online social networks: The role of negative ties. Physical Review E, 90(4):042817, 2014.
[LDFF15] M. Levorato, L. Drummond, Y. Frota, and R. Figueiredo. An ILS algorithm to evaluate structural balance in signed social networks. In ACM Symposium on Applied Computing, pages 1117–1122, 2015.
[NG04] M. E. J. Newman and M. Girvan. Finding and evaluating community structure in networks. Physical Review E, 69(2):026113, 2004.
[PL05] P. Pons and M. Latapy. Computing communities in large networks using random walks. Lecture Notes in Computer Science, 3733:284–293, 2005.
[RB08] M. Rosvall and C. T. Bergstrom. Maps of random walks on complex networks reveal community structure. Proceedings of the National Academy of Sciences, 105(4):1118, 2008.