
Scalable Application Layer Multicast
Suman Banerjee, Bobby Bhattacharjee, Christopher Kommareddy
http://www.cs.umd.edu/projects/nice
Group Communication
[Figure: source A and receivers B, C, D connected through routers 1 and 2]

Network-layer Multicast
– Replication at routers

Sequence of Direct Unicasts
– Replication only at source

Application-layer Multicast
– Replication at end-hosts
– Examples:
  • Narada, Yoid, Gossamer, HMTP, Scribe, Bayeux, CAN-multicast, DT, …
  • NICE
Application-layer Multicast
– Replication at end-hosts
Metrics
– Tree Quality
– State / Control Overheads
– Robustness
[Figure: overlay over hosts A, B, C, D and routers 1, 2]
Talk Outline
• Introduction
• NICE Application-layer Multicast Protocol
• Results
• Conclusions
NICE Application-layer Multicast
• Scales to large group sizes
  – Even low-bandwidth applications are efficient
  – Example: Web tickers
• Uses a hierarchy
  – Low average and worst case control overheads
  – Does not compromise tree quality or robustness
NICE Topologies
• Control topology
  – Detects host failures and re-structures the overlay
• Data delivery topology
  – Basic path: implicitly defined by the hierarchy
  – Can be independent of the control path
NICE Hierarchy
• Start from the set of members (Layer 0)
• Clusters
  – Non-overlapping
  – Proximity-based
  – Size: k to 3k-1
• Graph-theoretic center is the cluster leader
• Leaders form the next higher layer, and the process repeats
• log N layers
[Figure: Layer-0 members grouped into clusters led by B0, B1, B2; the leaders form Layer 1, whose leader C0 forms Layer 2]
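A minimal sketch of how such a hierarchy could be assembled; the member coordinates, distance function and greedy grouping below are illustrative assumptions, while the rules it exercises (proximity-based clusters of about k members, the graph-theoretic center as leader, leaders promoted layer by layer until one remains) come from the slides above.

```python
# Illustrative sketch only: build a NICE-style layered hierarchy.
# The coordinates, distance function and greedy grouping are assumptions;
# the slides only specify clusters of size k..3k-1, the graph-theoretic
# center as leader, and leaders forming the next layer until one remains.
import math

K = 3  # cluster size parameter k

COORDS = {f"M{i}": (i % 4, i // 4) for i in range(12)}  # made-up member positions

def distance(a, b):
    # stand-in for a measured end-to-end latency between two members
    return math.dist(COORDS[a], COORDS[b])

def center(cluster):
    # graph-theoretic center: the member minimizing its maximum distance to peers
    return min(cluster, key=lambda m: max(distance(m, p) for p in cluster))

def make_clusters(members):
    # crude proximity grouping into chunks of k members
    # (the real protocol keeps every cluster between k and 3k-1)
    anchor = members[0]
    ordered = sorted(members, key=lambda m: distance(anchor, m))
    return [ordered[i:i + K] for i in range(0, len(ordered), K)]

def build_hierarchy(members):
    layers, current = [], list(members)
    while len(current) > 1:
        clusters = make_clusters(current)
        layers.append(clusters)
        current = [center(c) for c in clusters]   # leaders populate the next layer
    layers.append([current])                      # topmost layer: a single member
    return layers                                 # roughly log_k(N) layers

for i, layer in enumerate(build_hierarchy(list(COORDS))):
    print(f"Layer {i}: {layer}")
```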
Control Topology
• Each member keeps soft state about all of its cluster peers, refreshed by HeartBeats
• State and control message overheads:
  – Average: constant
  – Worst case: O(k log N)
[Figure: control topology within each cluster, across all layers]
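A small illustration of what that soft state amounts to, using a made-up k = 3 hierarchy: an ordinary member heartbeats its O(k) cluster peers, while a member that leads a cluster in every layer heartbeats O(k log N) peers.

```python
# Illustrative only: per-member control state in a NICE-style hierarchy.
# A member keeps soft state about, and exchanges HeartBeats with, its peers
# in every cluster it belongs to.  The tiny k = 3 hierarchy is made up.

LAYERS = [
    # Layer 0: all members, in proximity clusters led by B0, B1, B2
    [["A0", "A1", "B0"], ["A2", "A3", "B1"], ["A4", "A5", "B2"]],
    # Layer 1: the Layer-0 leaders, with B0 also serving as the topmost leader
    [["B0", "B1", "B2"]],
]

def heartbeat_peers(member):
    """All cluster peers that `member` refreshes soft state with via HeartBeats."""
    peers = set()
    for layer in LAYERS:
        for cluster in layer:
            if member in cluster:
                peers.update(p for p in cluster if p != member)
    return peers

print(sorted(heartbeat_peers("A1")))  # ordinary member: O(k) peers
print(sorted(heartbeat_peers("B0")))  # leader in every layer: O(k log N) peers
```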
Basic Data Path
[Figure: data forwarded along the cluster hierarchy: within the source's Layer-0 cluster, up through its leader, and down into the other clusters]
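The forwarding rule pictured above can be sketched as follows, reusing the same made-up k = 3 hierarchy; this is an illustration of the rule that a member forwards a packet to its cluster peers in every layer it belongs to (never back into the cluster the packet came from), not the authors' code.

```python
# Illustrative only: the basic data path follows the hierarchy implicitly.
# Rule sketched here: a member forwards a packet to all of its cluster peers
# in every layer it belongs to, except back into the cluster it arrived from.
# The k = 3 hierarchy is the same made-up example as in the previous sketch.

LAYERS = [
    [["A0", "A1", "B0"], ["A2", "A3", "B1"], ["A4", "A5", "B2"]],  # Layer 0
    [["B0", "B1", "B2"]],                                          # Layer 1
]

def clusters_of(member):
    return [c for layer in LAYERS for c in layer if member in c]

def forward(member, packet, from_cluster=None, delivered=None):
    delivered = set() if delivered is None else delivered
    delivered.add(member)
    for cluster in clusters_of(member):
        if cluster is from_cluster:
            continue                          # never send back where it came from
        for peer in cluster:
            if peer not in delivered:
                forward(peer, packet, from_cluster=cluster, delivered=delivered)
    return delivered

# A0 multicasts a packet: it reaches every member of the group.
print(sorted(forward("A0", "data")))
```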
NICE Invariants
• The NICE protocol maintains these invariants:
  – Cluster sizes between k and 3k-1
  – Cluster leader is the central member
  – Leaders form the next higher layer
NICE Protocol Operations
• Member Join
• Member Depart
• Cluster Split
• Cluster Merge
• Cluster Refine
Join Procedure
• Assume a Rendezvous Point (RP)
• The new member A3 sends Join L0 to the RP
• The RP replies with the topmost layer, L2: {C0}
• A3 queries C0, which replies with its Layer-1 cluster, L1: {B0, B1, B2}
• A3 probes B0, B1, B2, forwards the join to the closest one, and receives its Layer-0 cluster, L0: {…}
• A3 attaches to that Layer-0 cluster
• Overhead: O(log N) RTTs and O(k log N) messages
  – Optimizations possible
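A sketch of that layered descent; the rtt_probe function, the top-layer record and the per-leader cluster map are made-up stand-ins for the real probes and query messages. One layer is crossed per RTT round, which gives the O(log N) RTT and O(k log N) message bounds quoted above.

```python
# Illustrative sketch of the join walk (not the protocol implementation).
# The new host gets the topmost layer from the Rendezvous Point, then at each
# step probes the cluster it was given, picks the closest member, and asks it
# for its cluster one layer down, until it reaches Layer 0 and attaches there.
import random

random.seed(0)

def rtt_probe(host, member):
    # stand-in for measuring the round-trip time from `host` to `member`
    return random.uniform(1.0, 50.0)

def join(host, rp_top_layer, lower_cluster, num_layers):
    cluster = rp_top_layer                          # e.g. L2: {C0}, from the RP
    for _ in range(num_layers - 1):                 # one layer per RTT round
        closest = min(cluster, key=lambda m: rtt_probe(host, m))  # O(k) probes
        cluster = lower_cluster[closest]            # that member's cluster below
    return cluster                                  # Layer-0 cluster to attach to

# Tiny example matching the figure: C0 heads L2 and leads {B0, B1, B2} in L1,
# and each Bi leads a Layer-0 cluster (all names illustrative).
TOP = ["C0"]
LOWER = {
    "C0": ["B0", "B1", "B2"],
    "B0": ["A0", "A1", "B0"],
    "B1": ["A2", "A3", "B1"],
    "B2": ["A4", "A5", "B2"],
}
print("A3 attaches to:", join("A3", TOP, LOWER, num_layers=3))
```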
Cluster Split
• Cluster size: 4 to 11 (k = 4)
• When a cluster grows past the size bound, it is split into two new clusters
  – Each new cluster has at least 3k/2 members
• Leadership is transferred (LeaderTransfer) to the centers of the new clusters, B3 and B4
• B3 and B4 join Layer 1 (Join L1); the old leader B0 leaves Layer 1 (Leave L1)
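A sketch of the split step itself, with made-up one-dimensional positions standing in for inter-member distances; only the rules on the slide (split when the cluster outgrows 3k-1, produce two clusters of at least 3k/2 members, hand leadership to their centers) are taken from the talk.

```python
# Illustrative only: splitting an oversized cluster.  The 1-D positions are a
# made-up stand-in for inter-member distances; the rules (split when the size
# exceeds 3k-1, two new clusters of at least 3k/2 members, centers become the
# new leaders) are the ones on the slide.

K = 4  # cluster size parameter from the slide (allowed sizes 4 to 11)

def center(cluster, pos):
    # graph-theoretic center: the member minimizing its maximum distance to peers
    return min(cluster, key=lambda m: max(abs(pos[m] - pos[p]) for p in cluster))

def split(cluster, pos):
    assert len(cluster) > 3 * K - 1, "split only when the size bound is exceeded"
    ordered = sorted(cluster, key=pos.get)        # crude proximity ordering
    half = len(ordered) // 2                      # two roughly equal new clusters
    parts = ordered[:half], ordered[half:]
    return [(part, center(part, pos)) for part in parts]

# A cluster of 12 members (> 3k-1 = 11) led by B0 splits in two; the centers of
# the halves become the new leaders, join Layer 1, and B0 leaves Layer 1.
pos = {name: float(i) for i, name in enumerate([f"M{i}" for i in range(11)] + ["B0"])}
for members, leader in split(list(pos), pos):
    print(leader, "leads", members)
```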
Results
• Simulations
  – 10,000 node Transit-Stub graphs
  – Group sizes up to 2048
  – Comparisons with Narada [CMU]
• Wide-area Experiments
  – Members at 8 sites
  – Group sizes up to 96
  – Dynamic joins and (ungraceful) leaves
  – Constant rate data source
Evaluation Metrics
• Tree Quality: Stress
  – Number of copies of the same data packet on a link/router
  – Example: stress on link [A-1] = 2
• Tree Quality: Stretch
  – Ratio of the overlay latency to the direct unicast latency
  – Example: stretch for receiver D = 5/3
• State at end-hosts
  – Control overheads
• Robustness
  – Host failures
[Figure: application-layer multicast overlay over hosts A, B, C, D and routers 1, 2]
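The two examples can be reproduced with a small assumed topology (hosts A and B behind router 1, C and D behind router 2, unit-latency links, overlay tree A->B, A->C, C->D); the topology is a guess chosen to match the slide's numbers, while the metric definitions are the ones above.

```python
# Worked example of stress and stretch.  The unit-latency topology (A and B on
# router 1, C and D on router 2) and the overlay tree A->B, A->C, C->D are
# assumptions chosen to reproduce the slide's numbers; the metric definitions
# are the ones above.

# Router-level path (as a list of physical links) between host pairs.
PATH = {
    ("A", "B"): [("A", "1"), ("1", "B")],
    ("A", "C"): [("A", "1"), ("1", "2"), ("2", "C")],
    ("A", "D"): [("A", "1"), ("1", "2"), ("2", "D")],
    ("C", "D"): [("C", "2"), ("2", "D")],
}
OVERLAY = [("A", "B"), ("A", "C"), ("C", "D")]   # application-layer tree edges
OVERLAY_ROUTE = {"B": [("A", "B")],              # overlay hops from source A
                 "C": [("A", "C")],
                 "D": [("A", "C"), ("C", "D")]}

def latency(u, v):
    return len(PATH[(u, v)])                     # unit-latency links: hop count

def stress(link):
    # copies of the same data packet crossing a physical link
    return sum(link in PATH[edge] for edge in OVERLAY)

def stretch(receiver):
    # overlay latency divided by the direct unicast latency
    overlay = sum(latency(u, v) for u, v in OVERLAY_ROUTE[receiver])
    return overlay / latency("A", receiver)

print("stress on link [A-1]:", stress(("A", "1")))   # 2, as on the slide
print("stretch for receiver D:", stretch("D"))       # 5/3, about 1.67
```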
Example Scenario
• 128 members join during the first 200 seconds
• 16 members leave within 10 seconds, after 1000 seconds
[Plot: group membership over time, 0 to 1400 seconds]

Tree Quality: Stress
[Plot: resource usage at links, first 200 seconds]

Tree Quality: Stretch
[Plot: end-to-end latency to receivers, first 200 seconds]

Failure Recovery
[Plot: recovery after the 16-member departure at 1000 seconds]
Control Overheads
Bandwidth overheads averaged over all network routers:

Group Size   Narada-30   NICE
32           9.23        1.03
128          65.62       1.19
512          199.96      1.93
1024         -           2.81
1560         -           3.28
2048         -           5.18
Wide-area Testbed
• 8 sites:
  A: cs.ucsb.edu    B: asu.edu       C: cs.umd.edu   D: glue.umd.edu
  E: wam.umd.edu    F: umbc.edu      G: poly.edu     H: ecs.umass.edu
[Figure: testbed topology showing the source site and inter-site latencies between 0.5 and 39.4 ms]
Failure Recovery
• Includes the effects of network losses
[Plots: failure recovery in the wide-area experiments]
Related Work
• Mesh-first
  – Narada, Gossamer
• Tree-first
  – Yoid, HMTP
• Implicit
  – Scribe, Bayeux, CAN-multicast, Delaunay-Triangulation
• “A Comparative Study of Application Layer Multicast Protocols”, S. Banerjee and B. Bhattacharjee
  – Available at: http://www.cs.umd.edu/~suman/publications.html
Current Work
• Detailed analysis of tree quality
  – Stress and stretch
• Implementing applications
  – Video delivery
Conclusions
• NICE scales to large member groups
• Scalability using hierarchy
  – Low control overhead
  – Does not sacrifice tree quality or robustness

http://www.cs.umd.edu/projects/nice