Towards Publish/Subscribe Functionality on Graphs

TOWARDS PUBLISH/SUBSCRIBE
FUNCTIONALITY ON GRAPHS
Lefteris Zervakis, Christos Tryfonopoulos, Vinay Setty,
Stephan Seufert, Spiros Skiadopoulos
University of the Peloponnese, Tripolis, Greece
Max Planck Institute, Saarbrücken, Germany
Graphs are everywhere!
■ Social graphs
■ Protein to Protein Interactions
■ Knowledge graphs
■ Communication networks
Evolving graphs
■ Massive in scale
■ Evolving at varying rates
– Wikipedia: around 100k triples added or removed daily
– Facebook: around 144 million new links added daily
Current approach
■ Search/mining
■ One type/class of queries
■ Usual approach:
– graph indexed
– query evaluated against index
– graph changes: re-computation or incremental update
Our approach
■ Publish/subscribe graphs
■ Set of standing queries
– structural constraints
– attribute constraints
■ Publish/subscribe approach
– queries indexed
– graph updates: evaluation against query index
Publish/Subscribe terminology
■ Subscriptions: Standing queries on graphs
– sub-graph structure
– attributes
– measures (clustering coefficient, density)
– properties (diameter)
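A measure-based subscription can be pictured as a predicate over a graph measure. The sketch below is illustrative only: it uses the standard density of a directed graph without self-loops, |E| / (|V| (|V| - 1)), and the names `density` and `density_subscription` are hypothetical, not taken from the presented system.

```python
from typing import Callable, Iterable, Set, Tuple

Edge = Tuple[str, str]


def density(num_nodes: int, edges: Iterable[Edge]) -> float:
    """Density of a directed graph without self-loops: |E| / (|V| * (|V| - 1))."""
    if num_nodes < 2:
        return 0.0
    distinct: Set[Edge] = set(edges)
    return len(distinct) / (num_nodes * (num_nodes - 1))


def density_subscription(threshold: float) -> Callable[[int, Iterable[Edge]], bool]:
    """Hypothetical standing query: true once density reaches the threshold."""
    def matches(num_nodes: int, edges: Iterable[Edge]) -> bool:
        return density(num_nodes, edges) >= threshold
    return matches


# Usage: a directed triangle has 3 of the 6 possible edges.
triangle = [("A", "B"), ("B", "C"), ("C", "A")]
sub = density_subscription(0.5)
print(density(3, triangle), sub(3, triangle))
```

Recomputing the measure from scratch on every update is the simplest option; an incremental variant would be needed at the scales mentioned earlier.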
Publish/Subscribe terminology
■ Publications: a stream of graph updates (edge and
node additions and removals, attribute/label
updates)
■ Notification: Subgraphs that match standing
queries
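One way to picture a publication is as a small, time-stamped update record applied to an in-memory graph. The following is a minimal sketch under that assumption; the names `Op`, `Update`, and `Graph` and their fields are hypothetical and not part of the presented system.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional, Set, Tuple

Edge = Tuple[str, str]


class Op(Enum):
    ADD_NODE = "add_node"
    REMOVE_NODE = "remove_node"
    ADD_EDGE = "add_edge"
    REMOVE_EDGE = "remove_edge"
    SET_LABEL = "set_label"


@dataclass(frozen=True)
class Update:
    """One publication event (hypothetical record layout)."""
    op: Op
    node: Optional[str] = None
    edge: Optional[Edge] = None
    label: Optional[str] = None
    timestamp: int = 0


@dataclass
class Graph:
    labels: Dict[str, Optional[str]] = field(default_factory=dict)
    edges: Set[Edge] = field(default_factory=set)

    def apply(self, u: Update) -> None:
        if u.op is Op.ADD_NODE:
            self.labels.setdefault(u.node, u.label)
        elif u.op is Op.REMOVE_NODE:
            self.labels.pop(u.node, None)
            # Removing a node also removes its incident edges.
            self.edges = {e for e in self.edges if u.node not in e}
        elif u.op is Op.ADD_EDGE:
            src, dst = u.edge
            self.labels.setdefault(src, None)
            self.labels.setdefault(dst, None)
            self.edges.add(u.edge)
        elif u.op is Op.REMOVE_EDGE:
            self.edges.discard(u.edge)
        elif u.op is Op.SET_LABEL:
            self.labels[u.node] = u.label
```

A notification would then be produced by whatever filtering step runs after `apply`, for example one of the query indexing algorithms.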
Graph publish/subscribe applications
■ Social networks
– target advertising
– community detection
■ Protein to protein interaction (PPI) graphs
– subscription to new interactions
■ Traffic networks, communication networks…
Query indexing algorithms
[Figure: three example query graphs over vertices A to E, labeled Query 1, Query 2, and Query 3]
Brute Force
[Figure: the three example query graphs (Query 1, Query 2, Query 3)]
Query   Edges
1       (A, B), (A, C), (B, D)
2       (A, B), (B, C), (C, A)
3       (A, B), (A, C), (D, A), (E, A)
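The brute-force idea can be sketched as follows: every query is stored as a plain list of edges, and each incoming edge is checked against all queries. This is a minimal illustration, not the authors' implementation. It assumes edges are ordered pairs of vertex identifiers and that a query is satisfied once all of its edges are present in the graph; `QUERIES` and `brute_force_filter` are hypothetical names.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]

# The three example queries, stored as plain edge lists.
QUERIES: Dict[int, List[Edge]] = {
    1: [("A", "B"), ("A", "C"), ("B", "D")],
    2: [("A", "B"), ("B", "C"), ("C", "A")],
    3: [("A", "B"), ("A", "C"), ("D", "A"), ("E", "A")],
}


def brute_force_filter(
    queries: Dict[int, List[Edge]],
    graph_edges: Set[Edge],
    new_edge: Edge,
) -> List[int]:
    """Add new_edge to the graph and return the queries it completes.

    Assumption: a query matches when every one of its edges is in the graph.
    """
    graph_edges.add(new_edge)
    notified = []
    for qid, q_edges in queries.items():
        # Scan the whole query, even if it does not mention the new edge.
        touches = any(e == new_edge for e in q_edges)
        if touches and all(e in graph_edges for e in q_edges):
            notified.append(qid)
    return notified


# Usage: the third edge closes the cycle of Query 2.
graph: Set[Edge] = set()
for edge in [("A", "B"), ("B", "C"), ("C", "A")]:
    print(edge, brute_force_filter(QUERIES, graph, edge))
```

The cost per update grows with the total number of query edges, which is what an index over the queries is meant to avoid.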
Inverted Index
[Figure: the three example query graphs (Query 1, Query 2, Query 3)]
Key (Edges)   Value (Query ID)
A B           1, 2, 3
A C           1, 3
B D           1
B C           2
C A           2
D A           3
E A           3
Query table
Query   Total Vertices   Matched Vertices
1       3                0
2       3                0
3       4                0
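The inverted-index idea can be sketched as a map from each edge to the IDs of the queries that contain it, plus a per-query table of counters. This is a minimal illustration under stated assumptions, not the authors' implementation: the counters here count edges (the slide labels its columns Total Vertices and Matched Vertices), a query is treated as satisfied when its matched count reaches its total, and the class and method names are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str]


class InvertedIndexFilter:
    def __init__(self) -> None:
        self.index: Dict[Edge, Set[int]] = {}  # edge -> IDs of queries containing it
        self.total: Dict[int, int] = {}        # query ID -> number of query edges
        self.matched: Dict[int, int] = {}      # query ID -> query edges seen so far
        self.seen: Set[Edge] = set()           # edges currently in the graph

    def add_query(self, qid: int, edges: Iterable[Edge]) -> None:
        distinct = set(edges)
        self.total[qid] = len(distinct)
        # Count edges that arrived before the query was registered.
        self.matched[qid] = sum(1 for e in distinct if e in self.seen)
        for e in distinct:
            self.index.setdefault(e, set()).add(qid)

    def add_edge(self, edge: Edge) -> List[int]:
        """Process an edge addition; return IDs of queries it completes."""
        if edge in self.seen:
            return []
        self.seen.add(edge)
        completed = []
        for qid in self.index.get(edge, ()):
            self.matched[qid] += 1
            if self.matched[qid] == self.total[qid]:
                completed.append(qid)
        return sorted(completed)

    def remove_edge(self, edge: Edge) -> None:
        """Process an edge removal by decrementing the affected counters."""
        if edge not in self.seen:
            return
        self.seen.discard(edge)
        for qid in self.index.get(edge, ()):
            self.matched[qid] -= 1


# Usage with the three example queries.
f = InvertedIndexFilter()
f.add_query(1, [("A", "B"), ("A", "C"), ("B", "D")])
f.add_query(2, [("A", "B"), ("B", "C"), ("C", "A")])
f.add_query(3, [("A", "B"), ("A", "C"), ("D", "A"), ("E", "A")])
for edge in [("A", "B"), ("A", "C"), ("B", "D")]:
    print(edge, f.add_edge(edge))
```

Each update touches only the queries listed under its edge key, so the work per update no longer depends on the size of the whole query database.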
Experimental evaluation: Dataset
■ Wikipedia Page Links
■ Publication events:
– time stamped
– 1.2 million pages (vertices)
– 1 million links (edges)
Experimental evaluation: Dataset
■ Subscriptions:
– matching: extracted from final graph
– non-matching: random
– average query lengths (edges): 4, 5, 6
– varying profile DB size: 10K, 30K, 50K
Filtering time
Filtering time
Varying query database size
Indexing time
Indexing time
Varying average edges per query
Future work
■ More query classes
– clustering coefficient
– shortest path
– betweenness centrality
■ Tree structures
[Figure: tree structure with nodes A, B, C, D and queries Q1, Q2, Q3]
Publish/Subscribe on graphs
■ New paradigm
– interesting applications
– proof-of-concept algorithms and evaluation
– interesting query classes
Thank you for your attention!
Questions?