
Constructing Scalable Overlays
for Pub/Sub With Many Topics
Problems, Algorithms, and Evaluation
G. Chockler, R. Melamed, Y. Tock, IBM Haifa Research Lab
R. Vitenberg, University of Oslo
PODC 2007
© 2007 IBM Corporation
Publish/Subscribe (Pub/Sub)
[Figure: nodes N1 to N5 attached to a message bus. Subscription(N1) = {B,C,D}; the other subscription sets shown are {A,B,C,E}, {A,D}, {A,X}, and {A,B,X}.]
Scalability of Pub/Sub

Most traditional pub/sub systems are geared towards
small-scale deployment
– E.g., Isis MDS, TIB, MQSeries, Gryphon

New generation of applications…
– Large data centers: Amazon, Google, Yahoo, eBay,…
– RSS, feed/news readers, on-line stock trading and banking
– Web 2.0, Second Life

…drive dramatic growth in scale
– 10,000s of nodes, 1000s of topics, Internet-wide distribution

Emerging systems address this trend using P2P
techniques
Overlay-Based Pub/Sub
[Figure: the five example nodes N1 to N5 (subscriptions {B,C,D}, {A,B,C,E}, {A,D}, {A,X}, {A,B,X}) linked in an overlay, with a relay marked.]
•SCRIBE
•Corona
•Feedtree
•Sub-2-Sub
•TERA
•...
Overlay Topologies for Pub/Sub

A “good” overlay allows efficient and
simple publication routing
– Small routing tables, low load on relays, low latency

Ideally, the overlay is topic-connected, i.e.,
one connected component for each topic-induced sub-graph
– Most existing implementations construct topic-connected overlays
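
The topic-connectivity condition can be checked mechanically. The following is a minimal sketch, not code from the talk: the function names are hypothetical, the assignment of subscription sets to N4 and N5 is assumed, and the edge list is an illustrative choice rather than the overlay drawn in the figures.

```python
from collections import deque

def topic_connected(interest, edges, topic):
    """True if the subscribers of `topic` form a single connected
    component in the sub-graph induced by those subscribers."""
    members = {v for v in interest if topic in interest[v]}
    if len(members) <= 1:
        return True
    # Keep only edges whose two endpoints both subscribe to the topic.
    adj = {v: set() for v in members}
    for u, v in edges:
        if u in members and v in members:
            adj[u].add(v)
            adj[v].add(u)
    # Breadth-first search from an arbitrary subscriber.
    start = next(iter(members))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return seen == members

def is_topic_connected_overlay(interest, edges):
    """True if every topic is connected in the overlay."""
    topics = set().union(*interest.values())
    return all(topic_connected(interest, edges, t) for t in topics)

# Assumed instance: subscription sets from the slides; which of N4/N5
# holds {A,X} versus {A,B,X} is a guess. The edges are hypothetical.
interest = {
    "N1": {"B", "C", "D"},
    "N2": {"A", "B", "C", "E"},
    "N3": {"A", "D"},
    "N4": {"A", "B", "X"},
    "N5": {"A", "X"},
}
edges = [("N1", "N2"), ("N2", "N4"), ("N4", "N5")]
disconnected = [t for t in "ABCDEX" if not topic_connected(interest, edges, t)]
```

With this particular edge list, `disconnected` contains A and D: N3 has no link to any other subscriber of either topic.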
Topic-Connectivity
[Figure: an overlay over the five example nodes N1 to N5 (subscriptions {B,C,D}, {A,B,C,E}, {A,D}, {A,X}, {A,B,X}).]
Topics B, C, X, E are connected
Topics A and D are disconnected
Topic-Connectivity: Simple Solution
[Figure: the five example nodes N1 to N5 (subscriptions {B,C,D}, {A,B,C,E}, {A,D}, {A,X}, {A,B,X}) connected by the simple solution.]

Node degree grows linearly with
the subscription size

Roughly twice as big as the average
subscription size for rings/trees
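
To make the degree growth concrete, here is a small sketch that builds one ring per topic and measures node degrees. It assumes the simple solution means an independent ring over each topic's subscribers; the function names, the alphabetical ring order, and the N4/N5 subscription assignment are all assumptions made for illustration.

```python
def ring_per_topic(interest):
    """Union of one ring per topic over that topic's subscribers.
    Edges are returned as a set of (smaller, larger) node pairs."""
    edges = set()
    topics = set().union(*interest.values())
    for t in topics:
        # Assumed ring order: subscribers sorted by name.
        members = sorted(v for v in interest if t in interest[v])
        if len(members) < 2:
            continue
        for i, u in enumerate(members):
            v = members[(i + 1) % len(members)]
            edges.add((min(u, v), max(u, v)))
    return edges

def degrees(interest, edges):
    """Number of overlay neighbors of each node."""
    deg = {v: 0 for v in interest}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

# Assumed instance based on the subscription sets in the slides.
interest = {
    "N1": {"B", "C", "D"},
    "N2": {"A", "B", "C", "E"},
    "N3": {"A", "D"},
    "N4": {"A", "B", "X"},
    "N5": {"A", "X"},
}
overlay = ring_per_topic(interest)
deg = degrees(interest, overlay)
average_degree = 2 * len(overlay) / len(interest)
```

A node gains at most two ring neighbors per subscribed topic, so `deg[v]` never exceeds `2 * len(interest[v])`. The exact edge count depends on how each ring orders its members, because rings of different topics may or may not share edges.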
Scalability of the Simple Solution

Negative impact on performance due to
– CPU load: neighbor monitoring, message processing
– Connection maintenance and header overhead
– Memory overhead: per-link state associated with
routing and/or compression schemes being used, etc.

Scalability barrier for large systems offering a
wide range of subscription choices
Can we do better?
The Min-TCO Problem

Minimum Topic-Connected Overlay (Min-TCO) problem:
– For a set of nodes V, a set of topics T, and
Interest: V × T → {true, false}
– Construct a topic-connected overlay G with
the minimum possible number of edges (or
average degree)

TCO (decision version):
– Decide whether there is a topic-connected
overlay consisting of k edges (for a given k)
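
For very small inputs, both versions of the problem can be explored by exhaustive search. The sketch below is only an illustration of the definitions (it is exponential in the number of node pairs, and the function names are hypothetical); it represents Interest as a mapping from each node to its set of topics, and the N4/N5 subscription assignment is assumed.

```python
from itertools import combinations

def is_tco(interest, edges):
    """Check topic-connectivity using one union-find per topic."""
    topics = set().union(*interest.values())
    for t in topics:
        members = [v for v in interest if t in interest[v]]
        parent = {v: v for v in members}

        def find(v):
            # Follow parent pointers to the root, halving the path.
            while parent[v] != v:
                parent[v] = parent[parent[v]]
                v = parent[v]
            return v

        for u, v in edges:
            if u in parent and v in parent:
                parent[find(u)] = find(v)
        if len({find(v) for v in members}) > 1:
            return False
    return True

def tco(interest, k):
    """Decision version: is there a topic-connected overlay with k edges?"""
    candidates = list(combinations(sorted(interest), 2))
    return any(is_tco(interest, subset)
               for subset in combinations(candidates, k))

def min_tco(interest):
    """Smallest k for which tco(interest, k) holds."""
    max_edges = len(interest) * (len(interest) - 1) // 2
    return next(k for k in range(max_edges + 1) if tco(interest, k))

# Assumed instance based on the subscription sets in the slides.
interest = {
    "N1": {"B", "C", "D"},
    "N2": {"A", "B", "C", "E"},
    "N3": {"A", "D"},
    "N4": {"A", "B", "X"},
    "N5": {"A", "X"},
}
optimum = min_tco(interest)  # 5 for this instance
```

On this instance the optimum is 5 edges: topics C and D force the edges N1-N2 and N1-N3, and topic A needs at least three further edges among its four subscribers.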
Complexity of TCO
[Figure: example with nodes N1 to N5 and subscription sets {B,C,D}, {A,B}, {A,D}, {A,B,C,D}, {A,C}.]
Lemma: TCO(V, T, Interest, k) ∈ NP
Proof: Topic connectivity is verifiable
in polynomial time
Lemma: TCO(V, T, Interest, k) is NP-hard
Proof:
1. Define an auxiliary problem, Single Node TCO (SN-TCO),
which is to decide if there is a topic-connected overlay in
which the degree of a single given node is ≤ d
2. Set Cover is polynomially reducible to SN-TCO
3. SN-TCO is polynomially reducible to TCO
Theorem: TCO is NP-complete
Approximating Min-TCO

The idea: exploiting subscription overlaps
– Connecting the nodes with overlapping interests
improves connectivity of several topics at once

Greedy Merge (GM) algorithm:
– Start from a singleton connected component for each
(v, t) ∈ V × T
– At each iteration: add an edge that reduces the
number of connected components for the biggest
number of topics
– Stop, once there is a single connected component for
each topic
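
The three steps above can be sketched directly. This is an unoptimized illustration rather than the authors' implementation: it creates components only for (v, t) pairs where v subscribes to t, breaks ties by taking the first heaviest edge in sorted order, and assumes which of N4/N5 holds which subscription set.

```python
from itertools import combinations

def greedy_merge(interest):
    """Greedy Merge sketch: repeatedly add the edge that reduces the
    number of connected components for the largest number of topics."""
    # comp[t][v] is the set of nodes in v's component for topic t.
    comp = {}
    for v, topics in interest.items():
        for t in topics:
            comp.setdefault(t, {})[v] = {v}

    def gain(u, v):
        # Topics for which u and v currently sit in different components.
        return sum(1 for t in interest[u] & interest[v]
                   if comp[t][u] is not comp[t][v])

    overlay = []
    candidates = list(combinations(sorted(interest), 2))
    while candidates:
        best = max(candidates, key=lambda e: gain(*e))
        if gain(*best) == 0:
            break  # every topic has a single component
        u, v = best
        overlay.append(best)
        for t in interest[u] & interest[v]:
            if comp[t][u] is not comp[t][v]:
                merged = comp[t][u] | comp[t][v]
                for w in merged:
                    comp[t][w] = merged
    return overlay

# Assumed instance based on the subscription sets in the slides.
interest = {
    "N1": {"B", "C", "D"},
    "N2": {"A", "B", "C", "E"},
    "N3": {"A", "D"},
    "N4": {"A", "B", "X"},
    "N5": {"A", "X"},
}
overlay = greedy_merge(interest)
average_degree = 2 * len(overlay) / len(interest)
```

On this instance the sketch adds five edges, for an average degree of 2. A different tie-breaking rule may pick different edges.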
Greedy Merge
[Figure: the five example nodes N1 to N5 with subscriptions {B,C,D}, {A,B,C,E}, {A,D}, {A,X}, {A,B,X}.]
Topic   # of conn. comps
A       4
B       3
C       2
D       2
X       2
E       1
Greedy Merge
[Figure: the five example nodes N1 to N5 with subscriptions {B,C,D}, {A,B,C,E}, {A,D}, {A,X}, {A,B,X}.]
Topic   # of conn. comps
A       4
B       2
C       1
D       2
X       2
E       1
Greedy Merge
[Figure: the five example nodes N1 to N5 with subscriptions {B,C,D}, {A,B,C,E}, {A,D}, {A,X}, {A,B,X}.]
Topic   # of conn. comps
A       3
B       1
C       1
D       2
X       2
E       1
Greedy Merge
[Figure: the five example nodes N1 to N5 with subscriptions {B,C,D}, {A,B,C,E}, {A,D}, {A,X}, {A,B,X}.]
Topic   # of conn. comps
A       2
B       1
C       1
D       2
X       1
E       1
Greedy Merge
[Figure: the five example nodes N1 to N5 with subscriptions {B,C,D}, {A,B,C,E}, {A,D}, {A,X}, {A,B,X}.]
Topic   # of conn. comps
A       2
B       1
C       1
D       1
X       1
E       1
Greedy Merge
[Figure: the five example nodes N1 to N5 with subscriptions {B,C,D}, {A,B,C,E}, {A,D}, {A,X}, {A,B,X}.]
Topic   # of conn. comps
A       1
B       1
C       1
D       1
X       1
E       1

Average degree of 2 vs. almost 3
for ring-per-topic!
GM Running Time

O(|V|4|T|)
– At most |V|2 iterations
– At most |V|2 edges inspected at each iteration
– At most |T| steps to inspect an edge

Can be optimized to run in O(|V|2 |T|)
– For each e  V  V, weight(e) = the number of
connected components merged by e
– At each iteration, output the heaviest edge and adjust
the other edge weights accordingly
– Stop once there are no more edges with weight > 0
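
The weight-based variant can be sketched as follows. This is an illustrative reading of the three bullets, not the authors' code, and it makes no claim to achieve the stated bound: the heaviest edge is found by a plain scan, and the function name, tie-breaking rule, and N4/N5 subscription assignment are assumptions.

```python
from itertools import combinations

def greedy_merge_weighted(interest):
    """Greedy Merge with explicitly maintained edge weights."""
    def key(a, b):
        return (a, b) if a < b else (b, a)

    # Initially every subscriber is its own component for each topic,
    # so weight(e) is the number of topics shared by e's endpoints.
    weight = {key(u, v): len(interest[u] & interest[v])
              for u, v in combinations(sorted(interest), 2)}
    comp = {(t, v): frozenset([v]) for v in interest for t in interest[v]}

    overlay = []
    while weight:
        best = max(weight, key=weight.get)  # heaviest edge
        if weight[best] == 0:
            break  # no edge merges anything any more
        u, v = best
        overlay.append(best)
        for t in interest[u] & interest[v]:
            cu, cv = comp[t, u], comp[t, v]
            if cu == cv:
                continue
            # Pairs spanning the two merged components no longer merge
            # anything for topic t, so each loses one unit of weight.
            for a in cu:
                for b in cv:
                    weight[key(a, b)] -= 1
            merged = cu | cv
            for w in merged:
                comp[t, w] = merged
    return overlay

# Assumed instance based on the subscription sets in the slides.
interest = {
    "N1": {"B", "C", "D"},
    "N2": {"A", "B", "C", "E"},
    "N3": {"A", "D"},
    "N4": {"A", "B", "X"},
    "N5": {"A", "X"},
}
overlay = greedy_merge_weighted(interest)
```

Each (pair, topic) combination is decremented at most once over the whole run, which is where the savings over re-inspecting every edge at every iteration come from.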
Approximability Results
Lemma:
1. The number of edges in the overlay constructed by
GM ≤ log(|V||T|) · OPT
Proof: Similar to that of the approximation ratio of the greedy
algorithm for Set Cover
2. There exists an input on which GM’s output meets
this ratio
Theorem: No polynomial-time algorithm can approximate Min-TCO within a
constant factor (unless P=NP)
Proof: Existence of such an algorithm would imply existence of a
constant-factor approximation for Set Cover, which is known to be
impossible (unless P=NP)
Practical Benefits
More Overlay Design Problems

Filtering: Given an upper bound d on the node
degree, minimize the number of relays used to
connect each topic
– Captures the cases when full topic-connectivity is
infeasible because of resource constraints

Diameter: Given an upper bound d on the node
degree, minimize the diameter of each topic in
the overlay
– Latency optimal routing under resource constraints

…
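
As an illustration of the quantity the Diameter problem targets, the sketch below measures the diameter of one topic in a given overlay. It assumes the diameter is counted in hops inside the topic-induced sub-graph, which the slide does not specify; the function name and the example overlay are hypothetical, and the N4/N5 subscription assignment is assumed.

```python
from collections import deque

def topic_diameter(interest, edges, topic):
    """Longest shortest path (in hops) between two subscribers of
    `topic`, using only edges between subscribers of that topic.
    Returns None if the topic is not connected."""
    members = {v for v in interest if topic in interest[v]}
    adj = {v: set() for v in members}
    for u, v in edges:
        if u in members and v in members:
            adj[u].add(v)
            adj[v].add(u)
    diameter = 0
    for src in members:
        # Breadth-first search gives hop distances from src.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        if len(dist) < len(members):
            return None
        diameter = max(diameter, max(dist.values()))
    return diameter

# Assumed instance and a hypothetical five-edge overlay.
interest = {
    "N1": {"B", "C", "D"},
    "N2": {"A", "B", "C", "E"},
    "N3": {"A", "D"},
    "N4": {"A", "B", "X"},
    "N5": {"A", "X"},
}
edges = [("N1", "N2"), ("N2", "N4"), ("N4", "N5"), ("N1", "N3"), ("N2", "N3")]
diameters = {t: topic_diameter(interest, edges, t) for t in "ABCDEX"}
```

In this overlay topic A has diameter 3 (the path N3, N2, N4, N5), which shows how an overlay with few edges can still leave long routes within a topic.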
Conclusions
Initiated formal study of the problem of
designing efficient and scalable overlay
topologies for pub/sub

Defined a representative problem (Min-TCO)
capturing the cost of constructing topic-connected overlays
– NP-completeness, polynomial approximation,
inapproximability results

Empirical evaluation showed effectiveness of our
approximation algorithm on practical inputs
Future Directions
Study dynamic case
– Dynamically changing interest assignments

Investigate other overlay design problems

Study distributed case
– Partial knowledge of other nodes' interests
Thank You!