
A Simple Linear Time Algorithm for Computing a (2k − 1)-Spanner of O(kn^(1+1/k)) Size in Weighted Graphs
Surender Baswana and Sandeep Sen
Slides and Presentation By: Nadav Hollander
So, what exactly is a t-spanner?
• A t-spanner is a subgraph of a given graph that preserves approximate distances between each pair of vertices.
• More precisely, a t-spanner of a graph is a subgraph in which, for any pair of vertices, their distance in the subgraph is at most t times their distance in the original graph.
• The parameter t is called the stretch factor.
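On small graphs the definition can be checked directly with an all-pairs shortest-path computation. A minimal sketch (the helper names are ours, not from the slides):

```python
import math
from itertools import product

def shortest_paths(n, edges):
    # All-pairs shortest paths (Floyd-Warshall) on an undirected weighted graph.
    d = [[math.inf] * n for _ in range(n)]
    for v in range(n):
        d[v][v] = 0.0
    for u, v, w in edges:
        d[u][v] = min(d[u][v], w)
        d[v][u] = min(d[v][u], w)
    for m, i, j in product(range(n), repeat=3):  # m is the outermost loop
        d[i][j] = min(d[i][j], d[i][m] + d[m][j])
    return d

def is_t_spanner(n, edges, sub_edges, t):
    # Definition check: dist_sub(u, v) <= t * dist_G(u, v) for every pair.
    dg, ds = shortest_paths(n, edges), shortest_paths(n, sub_edges)
    return all(ds[u][v] <= t * dg[u][v] for u in range(n) for v in range(n))

# A unit-weight triangle: dropping one edge leaves a detour of weight 2,
# so the subgraph is a 3-spanner (2 <= 3 * 1) but not a 1-spanner.
E = [(0, 1, 1), (1, 2, 1), (0, 2, 1)]
E_sub = [(0, 1, 1), (1, 2, 1)]
```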
3-Spanner Subgraph: Trivial Example
[Figure: an original weighted graph and a 3-spanner subgraph of it. For the removed edge of weight 4, the subgraph offers a replacement path, and 3 + 1 + 2 ≤ 3 ⋅ 4 = 12.]
Example 2: Not a 3-Spanner Subgraph
[Figure: an original weighted graph and a subgraph that is not a 3-spanner: for the removed edge of weight 2, the cheapest remaining path gives 4 + 3 + 1 ≰ 3 ⋅ 2 = 6.]
Example 3: Not a 3-Spanner Subgraph
[Figure: another original graph and subgraph pair for which the 3-spanner condition fails on one of the removed edges.]
Spanner fun facts!
• When discussing spanners, the original graph is usually assumed to be connected (two vertices without a path between them are at 'infinite' distance).
• Every t-spanner is a spanning subgraph (not so surprising): the graph and its spanner share the same vertex set V.
• Every connected graph is a 1-spanner of itself.
• Every MST T of a graph G(V, E) with |V| = n is an (n − 1)-spanner of G.
Motivation for Using t-Spanners
• Quite useful in distributed systems and communication networks.
• Basically, any time a sparse representation of an existing graph is needed while approximately preserving the original 'distance' between any two vertices.
Previous Work
Reliance on global or local distance computation (computing full shortest path trees from a few vertices, determining pairwise distances, building BFS trees up to a level k, etc.).

The Presented Algorithm
Functions without any sort of distance computation whatsoever, be it global or local.
Computing a (2k − 1)-Spanner

The Main Tool
As we compute the spanner, we will remove edges from the graph. We do so in a way that ensures that for each edge e removed, the following proposition 𝒫_t(e) holds (for a suitable t):
𝒫_t(x, y): the vertices x, y are connected in the subgraph G(V, E_S) by a path consisting of at most t edges, and the weight of each edge on this path is not more than that of the edge (x, y).
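𝒫_t can be tested mechanically: keep only spanner edges no heavier than the removed edge, then look for a path of at most t hops. A sketch, with our own function and variable names:

```python
from collections import deque

def p_t_holds(x, y, w, spanner_edges, t):
    # P_t(x, y): x and y are connected in the spanner by a path of at most
    # t edges, each of weight at most w = w(x, y).  We keep only edges of
    # weight <= w and run a BFS from x, counting hops.
    adj = {}
    for u, v, wt in spanner_edges:
        if wt <= w:
            adj.setdefault(u, []).append(v)
            adj.setdefault(v, []).append(u)
    hops = {x: 0}
    q = deque([x])
    while q:
        u = q.popleft()
        if hops[u] == t:
            continue  # no point extending a path past t edges
        for v in adj.get(u, []):
            if v not in hops:
                hops[v] = hops[u] + 1
                q.append(v)
    return hops.get(y, t + 1) <= t

# Spanner with two edges of weight 2; the removed edge (0, 2) has weight 3.
S = [(0, 1, 2), (1, 2, 2)]
```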
The Main Tool
[Figure: an edge e = (x, y) of weight w(e) = α is replaced by a path of at most t edges, each of weight ≤ α, so 𝒫_t(e) holds for e. Basically, the definition of a t-spanner.]
A Quick Overview
• In the 1st stage of the algorithm, we partition the vertex set V into clusters (in a randomized manner).
• The formation of clusters will involve deleting edges from the graph (for which 𝒫_t will hold).
• A cluster is a group of vertices that are, in a sense, 'closer' to each other than to vertices outside the cluster.
• We will define said 'closeness' as the radius of the cluster (pretty soon).
A Quick Overview
• Some clusters will form and grow (their radius will increase) at every iteration of the 1st stage.
• At some point, the algorithm will be satisfied with the clusters formed (no more iterations).
• Enter the 2nd stage: the algorithm will then connect said clusters to each other, forming the spanner in question.
Let's talk about clusters, shall we?

Definition 1
• A cluster is a subset of vertices. A partition of a set V′ ⊆ V into clusters is called a clustering.
• Each cluster is a singleton set in the beginning, and other vertices are added to the cluster as the algorithm proceeds.
• We shall denote the (unique) oldest member of a cluster (the vertex it started from) as the center of the cluster.
[Figure: the marked vertex is the center of its respective cluster. We'll talk about how to choose vertices to form clusters in the algorithm itself.]
Let's talk about clusters, shall we?

Definition 2
• Given a graph G(V, E), a set of edges ℰ ⊆ E induces a partition of V into clusters in the following way:
• Two vertices belong to the same cluster if they are connected by a path Π ⊆ ℰ.
• (In other words, each connected component is a cluster.)
• We refer to this clustering as the clustering induced by ℰ.
[Figure: the clustering induced by ℰ ⊆ E; the marked vertices are the centers of their respective clusters.]
Still Talking About Clusters

Definition 3
• Consider a clustering 𝒞 induced by some ℰ ⊆ E in a given graph. The radius of a cluster c ∈ 𝒞 is the smallest integer r such that the following holds:
For each edge (x, y) adjacent to the cluster (x ∈ c ∧ y ∉ c), there is a path in ℰ from x to v (the center of c) of at most r edges, each having weight not more than that of the edge (x, y).
[Figure: the clustering induced by ℰ ⊆ E; c1 is a cluster of radius 1, c2 is a cluster of radius 2.]
Still Talking About Clusters

Definition 4
• A clustering 𝒞 induced by ℰ ⊆ E is a clustering of radius ≤ i in the graph if each of its clusters has radius ≤ i.

Further Notations
• ℰ(x, c1): the edges from the set ℰ between the vertex x and the vertices of the cluster c1.
• ℰ(c1, c2): the set of edges between vertices of cluster c1 and vertices of cluster c2 that belong to the set ℰ.
[Figure: 𝒞 = {c1, c2} is a clustering of radius 2; E(x, c2) = {β, δ} and E(c1, c2) = {α, δ}.]
We'll Talk a Lot More About Clusters
• Before we go on to the description of the algorithm, there's still something to observe about the behavior of clusters, which we'll cover in the following two lemmas.
Lemma 3
• Let 𝒞 be a clustering of radius i induced by ℰ in a graph G(V′, ℰ ∪ E′), and let c ∈ 𝒞 be a cluster.
• For any vertex u ∉ c, adding the least-weight edge of the set E′(u, c) to the spanner ensures that the proposition 𝒫_(2i+1) holds for each edge e ∈ E′(u, c).
LEMMA 3
Setup: (1) E′(u, c) = {(u, y), (u, x), …}; (2) α ≤ β, where α = w(u, x) is the least weight in E′(u, c) and β = w(u, y); (3) c is a cluster of radius ≤ i with center v.
[Figure: u outside the cluster c; paths x → v and v → y inside c, each of at most i edges, of weight ≤ i ⋅ α and ≤ i ⋅ β respectively.]
The replacement path u → x → v → y has weight at most
α + iα + iβ ≤ β + iβ + iβ = (2i + 1)β
⇒ 𝒫_(2i+1)(e) holds for each e ∈ E′(u, c)!
Lemma 4
• For a given graph G(V′, E′ ∪ ℰ), let 𝒞 be a clustering induced by ℰ and let c1, c2 ∈ 𝒞 be two clusters having radius i and j respectively.
• Adding the least-weight edge of the set E′(c1, c2) to the spanner ensures that the proposition 𝒫_(2i+2j+1) holds true for the entire set E′(c1, c2).
• (Pretty much an extension of Lemma 3 to pairs of clusters.)
LEMMA 4
Setup: (1) E′(c1, c2) = {(y1, y2), (x1, x2), …}; (2) α ≤ β, where α = w(x1, x2) is the least weight in E′(c1, c2) and β = w(y1, y2); (3) c1 has radius ≤ i with center v1; (4) c2 has radius ≤ j with center v2.
[Figure: the replacement path y1 → v1 → x1 → x2 → v2 → y2 across the two clusters; the legs inside c1 weigh ≤ i ⋅ β and ≤ i ⋅ α, the legs inside c2 weigh ≤ j ⋅ α and ≤ j ⋅ β.]
Its weight is at most
iβ + iα + α + jα + jβ ≤ iβ + iβ + β + jβ + jβ = (2i + 2j + 1)β
⇒ 𝒫_(2i+2j+1)(e) holds for each e ∈ E′(c1, c2)!
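The two bounds can be sanity-checked numerically: the worst-case replacement-path weights from Lemmas 3 and 4 never exceed (2i + 1)β and (2i + 2j + 1)β whenever α ≤ β. A quick check (function names are ours):

```python
def lemma3_weight(i, a, b):
    # u -> x (weight a), x -> v (<= i edges, each <= a), v -> y (<= i edges, each <= b)
    return a + i * a + i * b

def lemma4_weight(i, j, a, b):
    # y1 -> v1 (<= i edges <= b), v1 -> x1 (<= i edges <= a), (x1, x2) of weight a,
    # x2 -> v2 (<= j edges <= a), v2 -> y2 (<= j edges <= b)
    return i * b + i * a + a + j * a + j * b

ok = all(
    lemma3_weight(i, a, b) <= (2 * i + 1) * b
    and lemma4_weight(i, j, a, b) <= (2 * i + 2 * j + 1) * b
    for i in range(6) for j in range(6)
    for a in (1, 2, 5) for b in (5, 9)   # always a <= b
)
```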
Phase 1 of the Algorithm: forming the clusters
• This phase executes ⌊k/2⌋ iterations.
• In each iteration i we deal with (V′, E′, 𝒞_(i−1)).
• V′ is the set of vertices we handle this iteration (it consists of the clusters and their neighbors from the previous iteration).
• E′ is the set of edges for which 𝒫_(2i+1) does not hold yet. Edges from E′ will either be deleted or added to the (2k − 1)-spanner.
• 𝒞_(i−1) is the clustering from the previous iteration.
• At the 1st iteration: V′ = V, E′ = E, E_S = ∅, 𝒞_0 = {{v} | v ∈ V}.
Phase 1 of the Algorithm: forming the clusters
1. Pick a sample ℛ_i of clusters from 𝒞_(i−1), each chosen independently with probability n^(−1/k).
2. 𝒞_i is initialized to ℛ_i.
3. For each v ∈ V′ that belongs to no sampled cluster:
◦ If v is not adjacent to any sampled cluster:
▪ Then for each cluster c ∈ 𝒞_(i−1) adjacent to v, we add to the spanner the least-weight edge from v to c.
Phase 1 of the Algorithm: forming the clusters
◦ Else (v is adjacent to 1 or more sampled clusters):
▪ Let c ∈ ℛ_i be the sampled cluster with the least-weight edge to v. We add said edge to the spanner and v to the cluster c (this is how clusters expand).
▪ In addition, for each cluster c′ ∈ 𝒞_(i−1) adjacent to v with an edge of weight less than the one just added, we add the least-weight edge between c′ and v.
Phase 1 of the Algorithm: forming the clusters
4. Lastly, we remove intra-cluster edges (edges whose endpoints both belong to the same cluster) of the clusters formed this iteration.
5. Observation: in the clustering 𝒞_i induced by ℰ_i, each cluster c ∈ 𝒞_i is the union of a sampled cluster from ℛ_i with the set of vertices for which that cluster was the nearest neighboring sampled cluster in 𝒞_(i−1). (Again, this is how clusters expand.)
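The steps above can be sketched in Python. This is our own simplified rendering, not the paper's implementation: the data layout (an edge dict keyed by sorted vertex pairs, a clustering given as a vertex → cluster-id map) and all names are ours, and neighbor information is recomputed from scratch rather than organized to match the paper's O(|E′|) per-iteration bound.

```python
import random

def phase1_iteration(vertices, edges, clustering, p, spanner, rng):
    # edges: {(u, v): weight} with u < v; clustering: vertex -> cluster id.
    clusters = set(clustering.values())
    sampled = {c for c in clusters if rng.random() < p}          # step 1
    new_clustering = {v: c for v, c in clustering.items() if c in sampled}
    settled = set()  # (v, c) pairs whose edge set E'(v, c) is fully dealt with

    # Least-weight edge from each vertex to each neighboring cluster.
    best = {v: {} for v in vertices}
    for (a, b), w in edges.items():
        for v, u in ((a, b), (b, a)):
            c = clustering[u]
            if c not in best[v] or w < best[v][c][0]:
                best[v][c] = (w, (a, b))

    for v in vertices:
        if clustering[v] in sampled:
            continue                     # vertices of sampled clusters stay put
        samp = {c: e for c, e in best[v].items() if c in sampled}
        if not samp:      # step 3a: connect v to every neighboring cluster
            for c, (w, e) in best[v].items():
                spanner.add(e)
                settled.add((v, c))
        else:             # step 3b: v joins the nearest sampled cluster
            c, (w, e) = min(samp.items(), key=lambda kv: kv[1][0])
            spanner.add(e)
            new_clustering[v] = c
            settled.add((v, c))
            for c2, (w2, e2) in best[v].items():
                if c2 != c and w2 < w:   # cheaper neighboring clusters
                    spanner.add(e2)
                    settled.add((v, c2))

    # Step 4: drop settled and intra-cluster edges from E'.
    surviving = {}
    for (a, b), w in edges.items():
        if (a, clustering[b]) in settled or (b, clustering[a]) in settled:
            continue
        ca, cb = new_clustering.get(a), new_clustering.get(b)
        if ca is not None and ca == cb:
            continue
        surviving[(a, b)] = w
    return surviving, new_clustering

# One iteration on a small weighted path graph, with singleton clusters.
verts = range(5)
E = {(0, 1): 1, (1, 2): 2, (2, 3): 1, (3, 4): 3}
span = set()
surv, cl = phase1_iteration(verts, E, {v: v for v in verts},
                            p=0.5, spanner=span, rng=random.Random(0))
```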
Theorem 1
• The following assertion holds for each iteration j ≥ 0:
• 𝒜(j): The clustering 𝒞_j induced by the set ℰ_j in the algorithm is a clustering of radius j.
Theorem 1
• Proof by induction.
• Corollary: for each edge e ∈ E′ eliminated from the graph during iteration i of the 1st phase, the proposition 𝒫_(2i+1) holds true.
• Since there are ⌊k/2⌋ iterations in the 1st phase, it follows from the corollary that 𝒫_(k+1) holds for each edge eliminated from the graph so far.
• Let's assume we have n = 81 vertices, and k = 4.
• At the 1st iteration we pick a sample from 81 clusters with probability n^(−1/k) = 1/81^(1/4) = 1/3.
• We expect a sample ℛ_1 of somewhere around 27 clusters, which will be of radius 1 at the end of the iteration.
• At the 2nd iteration, we pick a sample ℛ_2 from |𝒞_1| = 27 clusters, again with probability 1/3.
• At the end of the 2nd iteration, we'll probably have somewhere around 9 clusters of radius 2.
• We could go on, but after ⌊k/2⌋ iterations we stop.
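The arithmetic of this walkthrough checks out:

```python
n, k = 81, 4
p = n ** (-1 / k)        # sampling probability n^(-1/k) = 1/3 for n = 81, k = 4
r1 = n * p               # expected number of sampled clusters after iteration 1
r2 = r1 * p              # expected number after iteration 2
iterations = k // 2      # floor(k/2) = 2 iterations in phase 1
```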
Phase 2 of the Algorithm: cluster-cluster joining
• From Theorem 1, we know 𝒞_(⌊k/2⌋) is a clustering of radius ⌊k/2⌋.
• If k is odd, then for each pair of neighboring clusters c1, c2 ∈ 𝒞_(⌊k/2⌋), we add the least-weight edge between the two clusters to the spanner.
• If k is even, then for each pair of neighboring clusters c1 ∈ 𝒞_(⌊k/2⌋), c2 ∈ 𝒞_(⌊k/2⌋−1), we add the least-weight edge between the two clusters to the spanner.
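The joining step for the odd-k case (both endpoints drawn from the same clustering) can be sketched as follows; the data layout and names are ours, matching the phase-1 sketch's conventions:

```python
def phase2_join(edges, clustering, spanner):
    # For every pair of neighboring clusters, add the least-weight
    # surviving edge between them to the spanner.
    best = {}
    for (u, v), w in edges.items():
        cu, cv = clustering[u], clustering[v]
        if cu == cv:
            continue                      # intra-cluster edges are ignored
        key = (min(cu, cv), max(cu, cv))  # unordered cluster pair
        if key not in best or w < best[key][0]:
            best[key] = (w, (u, v))
    for _, e in best.values():
        spanner.add(e)

# Two clusters {0, 1} and {2, 3} linked by two inter-cluster edges:
# only the cheaper one, (1, 3), should enter the spanner.
span = set()
phase2_join({(0, 1): 5, (0, 2): 2, (1, 3): 1}, {0: 0, 1: 0, 2: 1, 3: 1}, span)
```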
PHASE 2: K IS ODD
Setup: (1) E′(c1, c2) = {(y1, y2), (x1, x2), …}; (2) α ≤ β; (3) c1 is a cluster of radius ≤ ⌊k/2⌋ with center v1; (4) c2 is a cluster of radius ≤ ⌊k/2⌋ with center v2.
[Figure: the path y1 → v1 → x1 → x2 → v2 → y2, with each leg inside a cluster of at most ⌊k/2⌋ edges.]
⌊k/2⌋β + ⌊k/2⌋α + α + ⌊k/2⌋α + ⌊k/2⌋β ≤ (2⌊k/2⌋ + 2⌊k/2⌋ + 1)β = (2(k − 1) + 1)β = (2k − 1)β
⇒ 𝒫_(2k−1)(e) holds for each e ∈ E′(c1, c2)!
PHASE 2: K IS EVEN
Setup: (1) E′(c1, c2) = {(y1, y2), (x1, x2), …}; (2) α ≤ β; (3) c1 is a cluster of radius ≤ k/2 with center v1; (4) c2 is a cluster of radius ≤ k/2 − 1 with center v2.
[Figure: the same path, but the legs inside c2 have at most k/2 − 1 edges.]
(k/2)β + (k/2)α + α + (k/2 − 1)α + (k/2 − 1)β ≤ (2(k/2) + 2(k/2 − 1) + 1)β = (2k − 1)β
⇒ 𝒫_(2k−1)(e) holds for each e ∈ E′(c1, c2)!
Running Time and Size Bounds
• From probability computations, we find that the expected number of edges contributed to the spanner by a vertex v in iteration i is n^(1/k).
• Thus we get a total upper bound of O(kn^(1+1/k)) edges in the spanner at the end of the 1st phase.
• From probability computations (again), we find that the expected number of edges contributed to the spanner in the 2nd phase is O(n^(1+1/k)).
• We get a total size of O(kn^(1+1/k)) for the spanner.
Running Time and Size Bounds
• In the 1st phase, at each iteration the algorithm picks a random sample of clusters and finds the neighboring sampled cluster nearest to each vertex.
• Also at each iteration, the algorithm selects min E′(v, c) for each v ∈ V′.
• Thus, every iteration takes O(|E′|) = O(m) time.
• The 2nd phase can be completed in O(|E′|) time (with O(n) extra space).
• Therefore the total running time is O(km).
Theorem 3
• From the running time of the algorithm, its correctness, and the size of the (2k − 1)-spanner computed, we can conclude:
• Given a weighted graph G(V, E) and an integer k > 1, a spanner of stretch (2k − 1) and O(kn^(1+1/k)) size can be computed in expected O(km) time.
Implementation in the Distributed Model

A Quick Recap: The Distributed Model
• Each node has its own local memory and processor.
• Computation takes place in synchronous rounds, where each round involves the passing of messages.
• 3 measures of complexity: the number of rounds, the total number of messages, and the maximum length of any message.
• Computation performed locally at a node in each round is free.
Implementation in the Distributed Model
• Each link in the distributed network has some positive length (weight) associated with it.
• The aim is to select O(kn^(1+1/k)) edges (links) ensuring a stretch of (2k − 1) for any missing link.
• The algorithm can be adapted to the distributed environment to compute a (2k − 1)-spanner in O(k^2) rounds with O(km) communication complexity.
Implementation in the Distributed Model
• As local information, each node stores the weight of each of its links.
• Each node also maintains information about the respective clusters to which it and each of its neighbors belong as the algorithm proceeds.
• This information is updated through message passing along the links after each iteration.
• The i-th iteration of phase 1 of the algorithm is executed in O(i) rounds with O(m) messages passed.
i-th Iteration of the Distributed Algorithm
Step 1: Forming a sample of clusters
• The center of each cluster c ∈ 𝒞_(i−1) declares c to be sampled with probability n^(−1/k).
• The center passes this information to its neighbors in cluster c.
• On receiving such a message, these neighbors in turn pass the message to their neighbors in cluster c.
• Since the cluster radius is at most (i − 1), it will take (i − 1) rounds until each vertex determines whether or not it belongs to a sampled cluster.
• Also note that the total number of messages passed is O(m).
i-th Iteration of the Distributed Algorithm
Step 2: Finding nearest neighboring sampled clusters
• Each vertex of a sampled cluster now declares to each of its neighbors that it is a member of a sampled cluster.
• Following this, each vertex computes its nearest neighboring sampled cluster.
i-th Iteration of the Distributed Algorithm
Step 3: Adding edges to the spanner
• Each vertex selects the edges to be added to the spanner.
• Then it joins the appropriate cluster in the clustering 𝒞_i, if needed.
• Every two neighboring nodes exchange information about their new clusters in 𝒞_i, and discard the link between them if they belong to the same cluster.
Running Time and Size Bounds
• We've seen that the i-th iteration is executed in O(i) rounds and that the total number of messages passed in these rounds is O(m).
• Therefore the total number of rounds for the algorithm is O(k^2).
• The total number of messages communicated is O(km).
• Also note that each message is of size O(log n).
• The spanner computed will have O(kn^(1+1/k)) edges in expectation.