18/12/2013 - OPODIS (Nice)
An Optimal Broadcast Algorithm for Content-Addressable Networks
Ludovic Henrio
Fabrice Huet
Justine Rochas
Background
Efficient Algorithm
Experiments
General Motivation – RDF Storage
 Context
 Semantic Web: RDF data
 Challenge
 Store and retrieve RDF data
 Large-scale setting
 Our solution
 Content Addressable Network
Content-Addressable Networks (CAN)
 Overlay network
 Nodes are peers
 Structured organization
 Multidimensional Cartesian space
 Entirely partitioned
 Each zone managed by one peer
 A zone = a (hyper)rectangle
 Neighborhood based on adjacent zones
[Figure: 2D CAN over the unit square, axes dim #1 and dim #2 from 0 to 1, partitioned into zones A to E]
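The zone and neighborhood definitions above can be sketched as follows. This is only an illustration: the `Zone` class, the `adjacent` and `neighbors` helpers, the half-open interval convention, and the five-zone layout are assumptions, not the layout shown in the figure.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Zone:
    """A CAN zone: one half-open interval [lo, hi) per dimension."""
    name: str
    lo: Tuple[float, ...]
    hi: Tuple[float, ...]

    def contains(self, point):
        return all(l <= x < h for l, x, h in zip(self.lo, point, self.hi))


def adjacent(a, b):
    """True if a and b share a face: they touch along exactly one
    dimension and their intervals overlap on every other dimension."""
    touching = 0
    for al, ah, bl, bh in zip(a.lo, a.hi, b.lo, b.hi):
        if ah == bl or bh == al:
            touching += 1
        elif not max(al, bl) < min(ah, bh):
            return False  # separated on this dimension
    return touching == 1


def neighbors(zone, zones):
    return [z for z in zones if adjacent(zone, z)]


# Hypothetical partition of the unit square into five zones.
ZONES = [
    Zone("A", (0.0, 0.0), (0.5, 0.5)),
    Zone("B", (0.0, 0.5), (0.5, 1.0)),
    Zone("C", (0.5, 0.5), (1.0, 1.0)),
    Zone("D", (0.5, 0.0), (1.0, 0.25)),
    Zone("E", (0.5, 0.25), (1.0, 0.5)),
]
```

In this layout A and C only meet at a corner point, so they are not neighbors; sharing a face is what the sketch takes "adjacent zones" to mean.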
Problem: Cost of Queries
2 queries over 2 variables: conjunction of two 2-dimensional broadcasts
Naive broadcast does not scale
1 query over 2 variables
1 query over 1 variable
Problem: Duplicated Messages
 Duplicated messages
 11 peers → 40 messages!
 How to eliminate duplicates?
 For each peer P
 Find the peer that is responsible for sending the message to P
[Figure: naive broadcast in a 2D CAN, axes dim #1 and dim #2 from 0 to 1; zone E labeled]
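To make the duplication concrete, here is a small simulation of naive flooding: every peer, on first reception, forwards to all its neighbors except the sender. The five-zone layout and the names `ZONES`, `NEIGHBORS` and `naive_flood` are hypothetical; this is not the 11-peer network of the slide.

```python
from collections import deque

# Hypothetical 2D CAN: name -> (lower corner, upper corner).
ZONES = {
    "A": ((0.0, 0.0), (0.5, 0.5)),
    "B": ((0.0, 0.5), (0.5, 1.0)),
    "C": ((0.5, 0.5), (1.0, 1.0)),
    "D": ((0.5, 0.0), (1.0, 0.25)),
    "E": ((0.5, 0.25), (1.0, 0.5)),
}


def adjacent(a, b):
    """Zones share a face: touch on one dimension, overlap on the others."""
    (alo, ahi), (blo, bhi) = a, b
    touching = 0
    for al, ah, bl, bh in zip(alo, ahi, blo, bhi):
        if ah == bl or bh == al:
            touching += 1
        elif not max(al, bl) < min(ah, bh):
            return False
    return touching == 1


NEIGHBORS = {p: [q for q in ZONES if adjacent(ZONES[p], ZONES[q])] for p in ZONES}


def naive_flood(initiator):
    """Returns the number of messages received by each peer."""
    received = {p: 0 for p in ZONES}
    seen = {initiator}
    queue = deque((initiator, q) for q in NEIGHBORS[initiator])
    while queue:
        sender, peer = queue.popleft()
        received[peer] += 1
        if peer in seen:
            continue  # duplicate: dropped, but the message was still sent
        seen.add(peer)
        queue.extend((peer, q) for q in NEIGHBORS[peer] if q != sender)
    return received
```

With this layout, `sum(naive_flood("A").values())` is 8, whereas 4 messages would be enough to reach the 4 other peers.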
Existing Solutions
 Use the CAN structure to route messages
 Meghdoot [1] « upperLeft » predicate
 M-CAN [2]
 M-CAN principles
 Initiator peer sends to all neighbors
 Other peers forward to neighbors on
 Same dimension on opposite side
 Lower dimensions on all sides
 Forwarding on the last dimension depends on a constraint
[Figure: Meghdoot starts from a corner; zones A, B, C]
[1] A. Gupta, O. D. Sahin, D. Agrawal, A. El Abbadi: Meghdoot: Content-Based Publish/Subscribe over P2P Networks. Middleware 2004
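The M-CAN forwarding rules listed above can be tried out on the simplest possible CAN, a uniform grid with one zone per cell. The sketch below makes several assumptions: the function name `mcan_like_broadcast` is hypothetical, the space does not wrap around, and the extra constraint on the last dimension is left out.

```python
from collections import Counter, deque


def mcan_like_broadcast(shape, initiator):
    """Directed flooding on a uniform grid of zones.

    A queued message is (cell, dim, step): it arrives at `cell` along
    dimension `dim`, travelling in direction `step` (+1 or -1).
    Returns a Counter: cell -> number of receptions.
    """
    received = Counter()
    queue = deque()

    def send(cell, dim, step):
        target = list(cell)
        target[dim] += step
        if 0 <= target[dim] < shape[dim]:  # stop at the border of the space
            queue.append((tuple(target), dim, step))

    # The initiator sends to all its neighbors.
    for dim in range(len(shape)):
        send(initiator, dim, -1)
        send(initiator, dim, +1)

    while queue:
        cell, dim, step = queue.popleft()
        received[cell] += 1
        send(cell, dim, step)          # same dimension, opposite side
        for lower in range(dim):       # lower dimensions, all sides
            send(cell, lower, -1)
            send(cell, lower, +1)
    return received
```

On a uniform grid these rules already deliver each message exactly once, because every cell is reached by a single path that follows the dimensions in decreasing order. The duplicates discussed in this talk appear when zones have different sizes, so that a zone can have several neighbors on the same side.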
M-CAN Execution
[Figure: M-CAN execution starting from the INIT peer. Legend: message; message that leads to duplication; corner constraint]
[2] S. Ratnasamy, M. Handley, R. M. Karp, S. Shenker: Application-Level Multicast Using Content-Addressable Networks. Networked Group Communication 2001
Preliminary Work
 Existence of an optimal algorithm proved [3]
 A solution to exhibit existence
 Valid for a very generic definition of CAN
 Not efficient
 Parallelizes message sending only when reaching a « border »
[3] F. Bongiovanni, L. Henrio: A Mechanized Model for CAN Protocols. FASE 2013
Hypothesis and Goals
 CAN = adjacent rectangles
 No additional structure
 Tolerates churn between two broadcasts
 Not implementation-dependent
 Does not tolerate churn during a broadcast
 Optimal in number of messages and good parallelization
[Figure: a spanning tree rooted at the INIT peer]
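Optimal in number of messages means that the messages of one broadcast form a spanning tree: with N peers, exactly N - 1 messages, each peer other than the initiator receiving once. A small checker for that property could look as follows; the name `is_spanning_tree` and the (sender, receiver) encoding of messages are assumptions made for the illustration.

```python
def is_spanning_tree(peers, messages, root):
    """messages is a list of (sender, receiver) pairs.

    True if every peer except `root` receives exactly once and every
    delivery chain leads back to `root`.
    """
    parent = {}
    for sender, receiver in messages:
        if receiver == root or receiver in parent:
            return False              # a peer received twice
        parent[receiver] = sender
    if set(parent) != set(peers) - {root}:
        return False                  # a peer was never reached
    for peer in parent:
        hops = 0
        while peer != root:
            peer = parent.get(peer)
            hops += 1
            if peer is None or hops > len(parent):
                return False          # unknown sender, or a cycle
    return True
```

For five peers A to E, the four messages A→B, A→D, A→E, B→C pass the check; adding a fifth message D→E makes it fail, since E then receives twice.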
Efficient Algorithm – Principle
 Removes all duplicates
 In all dimensions
 How?
 Uses the corner constraint
 Plus a spatial constraint
 A set of fixed values
 Reduces the problem
 Applies recursively
[Figures: spatial constraint in 2D CAN; spatial constraint in 3D CAN]
Efficient Algorithm
 Observation #1
 Easy to forward in 1D
 Observation #2
 Only one zone touches a corner
 Idea of the algorithm
 Suppose an efficient broadcast in dimension N
 Apply on a hyperplane of dimension N - 1
 Send to both sides of this hyperplane using the corner constraint
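The idea above can be sketched in a few dozen lines. This is a reconstruction from the slides, not the authors' implementation: the names `build_can`, `faces`, `allowed` and `broadcast` are hypothetical, zones are half-open hyperrectangles, the fixed values of the spatial constraint are taken to be the initiator's lower corner, and the random CAN is built by halving randomly chosen zones along a randomly chosen dimension.

```python
import random
from collections import Counter, deque


def build_can(n_peers, ndim, seed=0):
    """Random CAN: repeatedly split a randomly chosen zone in half."""
    rng = random.Random(seed)
    zones = [((0.0,) * ndim, (1.0,) * ndim)]
    while len(zones) < n_peers:
        lo, hi = zones.pop(rng.randrange(len(zones)))
        d = rng.randrange(ndim)  # assumption: split dimension is random
        mid = (lo[d] + hi[d]) / 2
        zones.append((lo, hi[:d] + (mid,) + hi[d + 1:]))
        zones.append((lo[:d] + (mid,) + lo[d + 1:], hi))
    return zones


def faces(p, q, d, step):
    """True if q is a neighbor of p along dimension d, on side `step`."""
    (plo, phi), (qlo, qhi) = p, q
    touch = phi[d] == qlo[d] if step > 0 else plo[d] == qhi[d]
    return touch and all(max(plo[k], qlo[k]) < min(phi[k], qhi[k])
                         for k in range(len(plo)) if k != d)


def allowed(p, q, d, point):
    """Spatial constraint below dimension d, corner constraint above it."""
    (plo, phi), (qlo, qhi) = p, q
    spatial = all(qlo[k] <= point[k] < qhi[k] for k in range(d))
    corner = all(plo[k] <= qlo[k] < phi[k] for k in range(d + 1, len(plo)))
    return spatial and corner


def broadcast(zones, initiator):
    """Returns a Counter: zone index -> number of receptions."""
    ndim = len(zones[0][0])
    point = zones[initiator][0]  # fixed values: the initiator's lower corner
    received = Counter()
    queue = deque()

    def send(src, d, step):
        for dst, q in enumerate(zones):
            if faces(zones[src], q, d, step) and allowed(zones[src], q, d, point):
                queue.append((dst, d, step))

    for d in range(ndim):            # the initiator sends on every dimension
        send(initiator, d, -1)
        send(initiator, d, +1)
    while queue:
        peer, d, step = queue.popleft()
        received[peer] += 1
        send(peer, d, step)          # same dimension, same direction
        for lower in range(d):       # lower dimensions, both directions
            send(peer, lower, -1)
            send(peer, lower, +1)
    return received
```

The corner constraint picks, for each zone, the single neighbor that contains its lower corner on the remaining dimensions; the spatial constraint keeps the higher-dimension messages inside the hyperplane through the fixed values. A call such as `broadcast(build_can(60, 2), initiator=7)` can be used to count the receptions per zone.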
Efficient Algorithm – Execution
[Figure: execution of the efficient algorithm starting from the INIT peer. Legend: message; message that leads to duplication; corner constraint; spatial constraint]
Efficient Algorithm – Properties
 Proved to be correct
 All peers receive a broadcast message at least once
 Proved to be minimal
 All peers receive a broadcast message at most once
 Elements of proof – When receiving on dimension D:
 For dim < D → contains spatial constraint
 For dim = D → ascending or descending direction
 For dim > D → checks corner constraint
This algorithm is optimal
All peers receive a broadcast message exactly once
Experimental Setup
 Using the Grid5000 platform
 Multisite experimentation
 Deployment
 From 50 to 1500 peers
 Up to 200 physical machines
 CAN setting
 Successively split zones in half
 Zone to split is chosen randomly
Number of messages
Maximum gain of 5.3 MB
Number of messages
Execution Time
Significant speedup
Conclusion: Broadcast on CAN
 We found an optimal solution
 Proved to be correct and optimal
 Efficient in large-scale settings
 Supports range multicast
 Currently in use in the EventCloud project [4]
 Management of RDF data
 Algorithm used for one year
 Tested and approved!
[Figure: a range multicast]
[4] http://www.play-project.eu/solutions/event-cloud