Correctness of Gossip-Based Membership under Message Loss
Maxim Gurevich, Idit Keidar
Technion
The Setting
• Many nodes – n
▫ 10,000s, 100,000s, 1,000,000s, …
• Come and go
▫ Churn
• Fully connected network
▫ Like the Internet
• Every joining node knows some others
▫ (Initial) Connectivity
Membership: Each Node Needs To Know Some Live Nodes
• Applications
▫ Gossip partners
▫ Unstructured overlay networks
▫ Gathering statistics
• Work best with random node samples
▫ Gossip algorithms converge fast
▫ Overlay networks are robust, good expanders
▫ Statistics are accurate
Membership Protocols
• Each node has a view
▫ Set of node ids
▫ Supplied to the application
▫ Used by the membership protocol for maintenance
▫ Modeled as a directed graph
[Figure: a view (v, y, w, …) drawn as directed edges among nodes u, v, w, y]
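The directed-graph model can be made concrete with a small sketch. The node names and view contents below are hypothetical, chosen only for illustration: each id in a node's view becomes one outgoing edge, so a node's outdegree is the size of its view and its indegree is the number of views that contain it.

```python
from collections import Counter

# Hypothetical views (illustration only): node id -> ids currently in its view.
views = {
    "u": ["v", "y", "w"],
    "v": ["w", "u"],
    "w": ["u", "v"],
    "y": ["u"],
}

def edges(views):
    """Return the directed edges (owner, id) induced by the views."""
    return [(owner, nid) for owner, view in views.items() for nid in view]

def degrees(views):
    """Return (outdegree, indegree) counters of the membership graph."""
    es = edges(views)
    out_deg = Counter(owner for owner, _ in es)
    in_deg = Counter(nid for _, nid in es)
    return out_deg, in_deg

out_deg, in_deg = degrees(views)
```

Indegree is a natural proxy for load here: a node that appears in many views is likely to be contacted more often.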
Desirable Properties
• Randomness…
• Holy grail for samples: IID
▫ Each sample uniformly distributed
▫ Each sample independent of other samples
 Avoid spatial dependencies among view entries
 Avoid correlations between nodes
▫ Good load balance among nodes
What About Churn?
Desirable Properties Cont’d
• Views should constantly evolve
▫ Remove failed nodes, add joining ones
• Views should evolve to IID from any state
• Minimize temporal dependencies
▫ Dependence on the past should decay quickly
▫ Useful for applications requiring fresh samples
Do Existing Protocols Measure Up?
Existing Work: Practical Protocols
Example: Push protocol
[Figure: push protocol example showing the views of nodes u, v, w, z]
• Studied only empirically
▫ Good load balance [Lpbcast, Jelasity et al 07] 
▫ Fast decay of temporal dependencies [Jelasity et al 07] 
▫ Induces spatial dependence 
Existing Work: Analysis
Shuffle protocol
[Figure: shuffle protocol example with nodes u, v, w, z]
• Analyzed theoretically [Allavena et al 05, Mahlmann et al 06]
▫ Uniformity, load balance, spatial independence 
▫ Unrealistic assumptions 
 Atomic actions with bi-directional communication
 No message loss
▫ No bounds on decay of temporal dependencies 
Our Contribution:
Bridge This Gap
• Formally specify the desirable properties outlined above
• A practical protocol
▫ Tolerates message loss, churn, failures
▫ No complex bookkeeping for atomic actions
• Formally prove the desirable properties
▫ Including under message loss
Send & Forget Membership
• The best of push and shuffle
• Some view entries may be empty
[Figure: S&F example in which u sends the ids u and w to v]
S&F: Message Loss
• Sent ids vanish when the message is lost
▫ Or when v’s view has no empty entries
[Figure: S&F example with message loss; nodes u, v, w]
S&F: Compensating for Loss
• Edges (view entries) disappear due to loss
• Need to prevent views from emptying out
• Keep the sent ids when too few ids remain in the view
[Figure: S&F example in which u keeps the sent ids; nodes u, v, w]
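The S&F slides can be read as the following single-step sketch. It is an illustration under stated assumptions, not the authors' specification: the function name `sf_step`, the parameters `loss_rate` and `dup_threshold`, the use of `None` for an empty entry, and the exact rules for choosing entries are all hypothetical.

```python
import random

def sf_step(views, u, rng, loss_rate=0.0, dup_threshold=2):
    """One Send & Forget action initiated by node u (illustrative sketch).

    views maps node id -> fixed-size list of ids, with None for an empty entry.
    Returns True if the sent ids were stored at the receiver.
    """
    view_u = views[u]
    filled = [i for i, nid in enumerate(view_u) if nid is not None]
    if len(filled) < 2:
        return False  # assumption: u needs two ids to act
    i, j = rng.sample(filled, 2)
    v, w = view_u[i], view_u[j]

    # Forget: clear the sent entries, unless the view holds too few ids,
    # in which case u keeps them to compensate for loss.
    if len(filled) > dup_threshold:
        view_u[i] = view_u[j] = None

    # Send the ids u and w to v; the message may be lost in transit.
    if rng.random() < loss_rate:
        return False

    # v stores the received ids in empty entries only; without enough
    # empty entries the ids are dropped, just as if the message were lost.
    view_v = views[v]
    empty = [k for k, nid in enumerate(view_v) if nid is None]
    if len(empty) < 2:
        return False
    view_v[empty[0]], view_v[empty[1]] = u, w
    return True
```

In this sketch a normal step conserves the total number of ids in the system, a lost message removes two ids, and a step below the threshold adds two. That is the balance the duplication threshold is meant to maintain.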
S&F: Advantages over Other Protocols
• No bi-directional communication
▫ No complex bookkeeping
▫ Tolerates message loss
• Simple
▫ Amenable to formal analysis
▫ Easy to implement
Key Contribution: Analysis
• Proving all desirable properties
▫ Analytical: degree distribution without loss
 Used in setting duplication threshold
▫ Markov 1: degree distribution with loss
▫ Markov 2: Markov Chain of reachable global states
 IID samples, Temporal Independence
• Hold even under (reasonable) message loss!
Analytic Degree Distribution
[Figure: node indegree distribution (x-axis: node indegree, 0 to 40; y-axis: 0 to 0.2) for Binomial, S&F Analytical, and S&F Markov]
• Similar to (and better than) that of a random graph
• Validated by a more accurate Markov model
Node Degree Markov Chain
[Figure: Markov chain over node states indexed by indegree and outdegree; legend: state corresponding to isolated node, transitions without loss, transitions due to loss]
• Numerically compute the stationary distribution
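The slides do not say which numerical method is used. As one common option, the sketch below applies power iteration to a small, hypothetical 3-state chain; the matrix `P` is illustrative and is not the degree chain from the talk.

```python
def stationary_distribution(P, tol=1e-12, max_iter=100_000):
    """Approximate pi with pi = pi * P by repeated multiplication.

    P is a row-stochastic matrix given as a list of rows. Convergence is
    assumed, which holds for irreducible, aperiodic chains.
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi

# Hypothetical 3-state chain (illustration only).
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]
pi = stationary_distribution(P)
```

For this toy chain the detailed-balance equations give the stationary distribution (0.25, 0.5, 0.25), which the iteration reaches to within the tolerance.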
Results
[Figure: node outdegree distribution for loss = 0, 0.01, 0.05, 0.1 (x-axis: node outdegree, 0 to 80; y-axis: 0 to 0.25)]
• Outdegree is bounded by the protocol
• Decreases with increasing loss
[Figure: node indegree distribution for loss = 0, 0.01, 0.05, 0.1 (x-axis: node indegree, 0 to 40; y-axis: 0 to 0.25)]
• Indegree is not bounded
• Low variance even under loss
• Typical overload at most 2x
Decay of Spatial Dependencies
[Figure: u sends ids to v but does not delete the sent ids from its own view; nodes u, v, w]
• For uniform loss < 15%, dependencies decay faster than they are created
• A fraction 1 – 2 × (loss rate) of view entries are independent
▫ E.g., for a loss rate of 3%, more than 90% of entries are independent
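The arithmetic behind the 3% example can be checked with a one-line helper. The function name is hypothetical; the formula is the 1 – 2 × (loss rate) expression from the slide.

```python
def independent_fraction(loss_rate):
    """Fraction of independent view entries per the slide: 1 - 2 * loss_rate."""
    return 1.0 - 2.0 * loss_rate

# For a 3% loss rate the formula gives about 0.94, i.e. more than 90%.
fraction_at_3_percent = independent_fraction(0.03)
```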
Temporal Independence
• Dependence on past views decays within O(log n · view size) time
• Use “expected conductance”
• Ids travel fast enough
▫ Reach random nodes in O(log n) hops
▫ Due to “sufficiently many” independent ids in views (see previous slide)
Conclusions
• Formalized the desired properties of a membership protocol
• Send & Forget protocol
▫ Simple for both implementation and analysis
• Analysis under message loss
▫ Load balance
▫ Uniformity
▫ Spatial Independence
▫ Temporal Independence
Thank You