Fast, Exact Graph Diameter Computation
with Vertex Programming
Vertex-Centric Computing for Large Scale Graph Analytics
Corey Pennycuff and Tim Weninger
SIGKDD Workshop on High Performance Graph Mining
August 10, 2015
Dijkstra’s Single Source Shortest Path
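[Figure: example graph on vertices A to G with source A; each vertex is labeled with its shortest-path distance from A]
Steps Taken: 𝑂(𝐸 + 𝑉 log 𝑉)
Vertex:   A  B  C  D  E  F  G
Distance: 0  1  1  1  1  2  2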
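As a point of reference for the sequential algorithm, here is a minimal Python sketch of Dijkstra's algorithm. The edge list in `GRAPH` is hypothetical: the slide only gives the distances from A, so the edges below are one unit-weight graph chosen to reproduce them. Note also that this sketch uses a binary heap, which runs in O((E + V) log V); the O(E + V log V) bound quoted on the slide assumes a Fibonacci heap.

```python
import heapq

# Hypothetical undirected, unit-weight graph (not taken from the slides);
# chosen so that distances from A are B, C, D, E = 1 and F, G = 2.
GRAPH = {
    "A": [("B", 1), ("C", 1), ("D", 1), ("E", 1)],
    "B": [("A", 1), ("F", 1)],
    "C": [("A", 1)],
    "D": [("A", 1), ("G", 1)],
    "E": [("A", 1)],
    "F": [("B", 1)],
    "G": [("D", 1)],
}

def dijkstra(graph, source):
    """Return shortest-path distances from source to every vertex."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue  # stale heap entry; a shorter path was already found
        for neighbor, weight in graph[v]:
            candidate = d + weight
            if candidate < dist[neighbor]:
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist

distances = dijkstra(GRAPH, "A")
print(distances)
```

On this assumed graph the result matches the distance table on the slide.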
Medium Graphs
4 million nodes
200 million edges
𝑂(𝐸 + 𝑉 log 𝑉)
= 200,000,000 + 4,000,000 log₁₀ 4,000,000
≈ 226,408,239 steps!
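The arithmetic can be checked in a few lines of Python. The slide's total is reproduced when the logarithm is taken base 10 and the result is truncated; the base-2 figure is computed alongside for comparison, since the choice of base only changes the constant factor.

```python
import math

V = 4_000_000    # nodes
E = 200_000_000  # edges

# Base-10 logarithm, truncated to a whole number of steps.
steps_base10 = int(E + V * math.log10(V))
# Base-2 logarithm, shown only for comparison.
steps_base2 = int(E + V * math.log2(V))

print(f"{steps_base10:,}")  # 226,408,239
print(f"{steps_base2:,}")
```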
Bigger Graphs
[Figure: disk storage]
Solution – Hadoop
[Figure: Hadoop MapReduce data flow: data → mappers → shuffle and sort → reducers → result, with disk reads and writes at every stage]
Graph Diameter
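The diameter of a graph 𝐺 is the length of the longest shortest path between any two of its vertices.
This is hard: computing it exactly requires shortest paths between all pairs of vertices.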
Approximate Solutions:
HADI
Reverse Cuthill-McKee
Random BFS
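To make the definition concrete, here is a minimal sketch of the straightforward exact computation for an unweighted, connected graph: run a breadth-first search from every vertex and take the largest eccentricity. It is a baseline for illustration, not one of the approximate methods listed above, and the example graph is hypothetical.

```python
from collections import deque

def eccentricity(adj, source):
    """Largest BFS distance from source to any reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for neighbor in adj[v]:
            if neighbor not in dist:
                dist[neighbor] = dist[v] + 1
                queue.append(neighbor)
    return max(dist.values())

def diameter(adj):
    """Exact diameter: one BFS per vertex, so O(V * (V + E)) overall."""
    return max(eccentricity(adj, v) for v in adj)

# Hypothetical undirected graph on vertices A to G (not taken from the slides).
ADJ = {
    "A": ["B", "C", "D", "E"],
    "B": ["A", "F"],
    "C": ["A"],
    "D": ["A", "G"],
    "E": ["A"],
    "F": ["B"],
    "G": ["D"],
}

print(diameter(ADJ))  # 4 (the path F, B, A, D, G)
```

The cost of one traversal per vertex is what makes the exact computation expensive on large graphs.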
Bulk Synchronous Parallel (BSP)
Introduced by Leslie Valiant in 1990 and developed further with Bill McColl at Oxford
[Figure: BSP data flow: data is loaded from disk, supersteps 0 to 3 run one after another separated by barriers with data kept in memory, and the result is written to disk]
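The superstep-and-barrier pattern can be sketched with Python threads. This is only an illustration of the synchronization idea, not how a distributed BSP framework is built: `run_bsp` and its two-superstep sum are hypothetical, and the workers share memory rather than exchanging messages over a network.

```python
import threading

def run_bsp(chunks):
    """Each worker sums its own chunk, then all workers combine the sums."""
    n = len(chunks)
    barrier = threading.Barrier(n)
    partial = [0] * n  # written in superstep 0, read in superstep 1
    totals = [0] * n

    def worker(i):
        # Superstep 0: purely local computation on in-memory data.
        partial[i] = sum(chunks[i])
        # Barrier: nobody continues until every worker has finished superstep 0.
        barrier.wait()
        # Superstep 1: all partial results are now safe to read.
        totals[i] = sum(partial)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals

print(run_bsp([[1, 2], [3, 4], [5]]))  # [15, 15, 15]
```

Without the barrier, a fast worker could read `partial` before a slow worker had written its entry.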
Graph Analytics with BSP
Requires the programmer to “think like a vertex”
[Figure: example graph with vertices A to F, …]
The Vertex
Each Vertex Can:
• Receive messages from the previous superstep
• Modify its value/datum
• Send messages
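The three capabilities map onto a single per-vertex method. The sketch below is a hypothetical `Vertex` class, not the API of any particular framework, and it uses "keep the largest value seen" as a stand-in task to show the receive, modify, send sequence for one superstep.

```python
from dataclasses import dataclass, field

@dataclass
class Vertex:
    id: str
    value: int
    out_edges: list = field(default_factory=list)  # ids of neighboring vertices
    halted: bool = False

    def compute(self, messages):
        """Run one superstep; return the (destination, payload) pairs to send."""
        # 1. Receive: look at the messages sent during the previous superstep.
        best = max(messages, default=self.value)
        if best > self.value:
            # 2. Modify: update this vertex's own value.
            self.value = best
            # 3. Send: messages are delivered at the start of the next superstep.
            return [(dest, self.value) for dest in self.out_edges]
        # Nothing changed, so this vertex has no more work for now.
        self.halted = True
        return []

v = Vertex("B", 3, ["A", "C"])
print(v.compute([1, 7]))  # [('A', 7), ('C', 7)]
print(v.compute([5]))     # []
```

A vertex sees only its own value, its own edges, and its incoming messages, which is what allows the framework to run all vertices of a superstep in parallel.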
BSP Single Source Shortest Path
[Figure: example graph on vertices A to G]
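// Runs once per superstep on every active vertex.
// datum holds this vertex's best-known distance from the source.
void compute(MessageIterator* msgs) {
    bool changed = false;
    // Keep the smallest distance offered by any incoming message.
    foreach (msg : msgs) {
        if (msg < datum) {
            datum = msg;
            changed = true;
        }
    }
    if (changed) {
        // The distance improved, so offer the new distance to every out-neighbor.
        foreach (edge : GetOutEdgeIterator()) {
            sendMessageTo(edge.dest, datum + edge.weight);
        }
    } else {
        // No improvement: go idle until another message arrives.
        voteToHalt();
    }
}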
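The compute function above can be tried out with a small single-process simulation. The Python sketch below is a port under stated assumptions: the edge list is hypothetical (chosen to reproduce the distances shown for source A), vertices start at infinity, and the computation is seeded by delivering a distance of 0 to the source, which is one plausible reading of the master's role in the slides.

```python
import math

# Hypothetical unit-weight graph (not taken from the slides).
GRAPH = {
    "A": [("B", 1), ("C", 1), ("D", 1), ("E", 1)],
    "B": [("A", 1), ("F", 1)],
    "C": [("A", 1)],
    "D": [("A", 1), ("G", 1)],
    "E": [("A", 1)],
    "F": [("B", 1)],
    "G": [("D", 1)],
}

def compute(vertex, datum, msgs, graph):
    """Return (new datum, outgoing messages); no messages means vote to halt."""
    best = min(msgs, default=math.inf)
    if best < datum:
        return best, [(dest, best + weight) for dest, weight in graph[vertex]]
    return datum, []

def bsp_sssp(graph, source):
    datum = {v: math.inf for v in graph}
    inbox = {v: [] for v in graph}
    inbox[source] = [0]  # assumed seeding step: the source is told its distance is 0
    # Each loop iteration is one superstep; swapping inboxes acts as the barrier.
    while any(inbox.values()):
        outbox = {v: [] for v in graph}
        for v, msgs in inbox.items():
            if msgs:  # halted vertices wake up only when they receive messages
                datum[v], outgoing = compute(v, datum[v], msgs, graph)
                for dest, payload in outgoing:
                    outbox[dest].append(payload)
        inbox = outbox
    return datum

result = bsp_sssp(GRAPH, "A")
print(result)
```

The loop stops when a superstep produces no messages, which corresponds to every vertex having voted to halt.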
Dijkstra’s Single Source Shortest Path
[Figure: example graph on vertices A to G with the master and source A]
Superstep 0
Vertex:   A  B  C  D  E  F  G
Distance: 0  ∞  ∞  ∞  ∞  ∞  ∞
Dijkstra’s Single Source Shortest Path
[Figure: example graph on vertices A to G with source A, vertices labeled with their distances]
Superstep 1
Vertex:   A  B  C  D  E  F  G
Distance: 0  1  1  1  1  2  2
Dijkstra’s Single Source Shortest Path
[Figure: example graph on vertices A to G with source A, vertices labeled with their distances]
Superstep 2
Vertex:   A  B  C  D  E  F  G
Distance: 0  1  1  1  1  2  2
Supersteps − 1 = Node Eccentricity
[Figure: example graph on vertices A to G labeled with the final distances from source A]
Done
Total Supersteps: O(ecc(A) + 1)
Total Messages: Θ(E)
Vertex:   A  B  C  D  E  F  G
Distance: 0  1  1  1  1  2  2
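Both totals can be checked on a small example with a level-by-level simulation. The sketch below makes two assumptions: the graph is a hypothetical unweighted one consistent with the distances above, and a superstep is counted only when at least one vertex changes its value and sends messages.

```python
def bsp_bfs_stats(adj, source):
    """Return (supersteps, messages, eccentricity of source)."""
    dist = {source: 0}
    frontier = [source]  # vertices whose value changed in this superstep
    supersteps = 0
    messages = 0
    while frontier:
        supersteps += 1
        next_frontier = []
        for v in frontier:
            for neighbor in adj[v]:
                messages += 1  # one message per outgoing edge of a changed vertex
                if neighbor not in dist:
                    dist[neighbor] = dist[v] + 1
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return supersteps, messages, max(dist.values())

# Hypothetical undirected graph with 6 edges (not taken from the slides).
ADJ = {
    "A": ["B", "C", "D", "E"],
    "B": ["A", "F"],
    "C": ["A"],
    "D": ["A", "G"],
    "E": ["A"],
    "F": ["B"],
    "G": ["D"],
}

supersteps, messages, ecc = bsp_bfs_stats(ADJ, "A")
print(supersteps, messages, ecc)  # 3 12 2
```

Here supersteps − 1 equals the eccentricity of A, and each vertex sends along each of its edges exactly once, so the 6 undirected edges carry 12 messages in total.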
Diameter Measurement
[Figure: seven copies of the example graph on vertices A to G]
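One way to read this slide is that the single-source computation is started from every vertex at once, and the number of supersteps needed before nothing new arrives gives the diameter. The Python sketch below simulates that idea for an unweighted, connected graph. It is an interpretation rather than the authors' implementation: the per-vertex set of seen source ids and the example graph are assumptions.

```python
def bsp_diameter(adj):
    """Flood every vertex id through the graph; the last superstep in which
    any vertex learns a new id equals the diameter."""
    seen = {v: {v} for v in adj}   # source ids each vertex has heard of
    fresh = {v: {v} for v in adj}  # ids learned in the previous superstep
    superstep = 0
    diameter = 0
    while any(fresh.values()):
        # Send phase: forward only the ids learned in the previous superstep.
        inbox = {v: set() for v in adj}
        for v, ids in fresh.items():
            for neighbor in adj[v]:
                inbox[neighbor] |= ids
        superstep += 1
        # Receive phase: an id first seen now is exactly `superstep` hops away.
        fresh = {}
        for v in adj:
            new_ids = inbox[v] - seen[v]
            seen[v] |= new_ids
            fresh[v] = new_ids
        if any(fresh.values()):
            diameter = superstep
    return diameter

# Hypothetical undirected graph on vertices A to G (not taken from the slides).
ADJ = {
    "A": ["B", "C", "D", "E"],
    "B": ["A", "F"],
    "C": ["A"],
    "D": ["A", "G"],
    "E": ["A"],
    "F": ["B"],
    "G": ["D"],
}

print(bsp_diameter(ADJ))  # 4
```

Because an id advances exactly one hop per superstep, the scheme relies on synchronous execution and on every edge counting as one hop.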
Limitations
Must be synchronous
Designed for unweighted graphs
Performance Results ER-Graphs (p=32%)
Performance Results SF-Graphs (k=3)
Performance Results Real World Graphs
Thank you