Improving Lookup Performance
over a Widely-Deployed DHT
Daniel Stutzbach
Reza Rejaie
The ION P2P Project
http://mirage.cs.uoregon.edu/P2P
University of Oregon
INFOCOM
Barcelona, Spain
April 27th, 2006
Distributed Hash Tables (DHTs)
Introduced in 2001
  • Goal: Allow fast, scalable lookups
  • Hyped as “second-generation” P2P
  • Focus of many research papers
  • For a long time, no significant deployment
Deployment Now:
  • Kad: 1,000,000+
  • Overnet: 500,000+
  • Azureus: 800,000+
  • All Kademlia based
Performance of a widely-deployed DHT with real churn
  • How efficient are lookups in practice?
  • How do parallel lookups improve performance?
  • How much replication is needed to ensure consistency?
But first some background…
Background: Kademlia

Several features to address churn:
  • Routing tables contain redundant routes.
      • Called k-buckets
  • Parallel routing quickly bypasses failed peers.
      • Relies on iterative routing
Lookups use prefix-matching
  • Similar to Pastry
Background: Routing in prefix-matching DHTs
Example with b = 2:
  Target:        1 0 0 1 1
  Source’s ID:   0 1 0 1 0
  1st Hop’s ID:  1 0 1 0 0
  2nd Hop’s ID:  1 0 0 1 0
If the first x bits match:
  • Point to a peer with x+b matching bits, or
  • Within 1 hop of the closest peer.
With high probability, need to match around log2 n bits.
Improve by b bits per step.
  • (log2 n) / b steps per lookup
High b yields quicker lookups, but larger routing tables
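A minimal sketch of the prefix-matching idea, using the 5-bit IDs from the example above. The helper name shared_prefix_bits and the fixed 5-bit ID width are assumptions made for illustration; real Kademlia IDs are much longer.

```python
ID_BITS = 5  # assumption: toy ID width matching the slide's example

def shared_prefix_bits(a: int, b: int, bits: int = ID_BITS) -> int:
    """Number of leading bits that a and b have in common."""
    # XOR leaves a 1 at every differing position; the highest set bit
    # marks the first mismatch when reading from the left.
    return bits - (a ^ b).bit_length()

target = 0b10011
path = [0b01010, 0b10100, 0b10010]  # source, 1st hop, 2nd hop

matches = [shared_prefix_bits(peer, target) for peer in path]
print(matches)  # [0, 2, 4]: each hop gains b = 2 bits
```

Each hop in the example matches two more leading bits of the target than the previous one, which is the "improve by b bits per step" behaviour with b = 2.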
Outline
Performance: Theory versus Practice
  • Theory predicts (log2 n) / b steps per lookup.
      • Measure n to be approximately 1 million.
      • Theory from Kademlia paper predicts 6.3 hops.
  • Emulate lookups from nodes to addresses.
      • In practice, average lookups take 3.2 hops.
Revising Theory: Analyzing the average-case
  • k-buckets improve performance.
  • Enrichment via k-buckets versus increasing b
Parallel Lookup
Ensuring consistency through replication
Theoretical Lookup Performance

Measured n ≈ 1 million peers in Kad.
  • Developed a fast P2P crawler, called Cruiser.
      • Global Internet 2005, IMC 2005
  • Adapted Cruiser to crawl Kad zones.
  • Measured thousands of Kad zones.
Kad improves 4 bits on the first step, at least 3 bits on each additional step.
  • (log2 1,000,000 − 4) / 3 + 1 ≈ 6.3 steps per lookup
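The prediction above can be checked with a few lines of arithmetic. This is a sketch only: the function name predicted_steps and its parameter names are illustrative, while the numbers (n ≈ 1 million, 4 bits on the first step, 3 bits on each later step) come from the slide.

```python
import math

def predicted_steps(n: int, first_step_bits: int = 4, later_step_bits: int = 3) -> float:
    """Steps needed to match about log2(n) bits of the target ID."""
    bits_needed = math.log2(n)
    # One step covers the first 4 bits; the remaining bits are covered
    # at 3 bits per step.
    return (bits_needed - first_step_bits) / later_step_bits + 1

steps = predicted_steps(1_000_000)
print(f"{steps:.2f} steps per lookup")
```

With log2 1,000,000 ≈ 19.93, this evaluates to roughly 6.3 steps, in line with the figure quoted from the Kademlia-style analysis.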
Empirical Lookup Performance

Goal: Measure lookup cost between (node, address) pairs
Emulate DHT lookups with kLookup
  • Leverage iterative routing
  • Probe node A to extract its routing table
  • Perform the lookup as node A
  • We can use a variety of different lookup strategies
We found an average lookup takes 3.2 hops, much better than the predicted 6.3!
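The emulation idea can be sketched as a greedy walk over routing tables that have already been extracted. This is only an illustration: the function emulate_lookup, the dictionary layout, and the toy 5-bit IDs are assumptions, not kLookup's actual interface, and the real tool probes live peers rather than reading an in-memory table.

```python
def emulate_lookup(tables: dict, start: int, target: int) -> list:
    """Greedy iterative lookup; returns the peers visited, start first."""
    path = [start]
    current = start
    while current != target:
        contacts = tables.get(current, [])
        if not contacts:
            break  # nothing to ask: the walk stops here
        # Kademlia measures closeness with XOR distance to the target.
        best = min(contacts, key=lambda peer: peer ^ target)
        if best ^ target >= current ^ target:
            break  # no contact is closer than the current peer
        path.append(best)
        current = best
    return path

# Hypothetical extracted routing tables (peer ID -> known contacts).
tables = {
    0b01010: [0b10100, 0b00111],
    0b10100: [0b10010, 0b11000],
    0b10010: [0b10011],
    0b10011: [],
}

path = emulate_lookup(tables, start=0b01010, target=0b10011)
hops = len(path) - 1
print([format(p, "05b") for p in path], hops)
```

Because routing is iterative, the measuring host can drive every step itself, which is what makes it possible to "perform the lookup as node A" without controlling node A.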
Investigating the performance gap

6.3 is a worst-case analysis (sort of).
  • Based on improving 3 bits per step.
  • Through chance, the next hop peer may have additional matching bits.
We derive a formula for average performance.
  • There are k chances to find a peer with additional matching bits.
  • k-buckets dramatically improve average performance.
For k = 20, suggested in the Kademlia paper:
  • Worst-case is 1 bit per step.
  • Average-case is 5.7 bits per step!
Is it better to enrich a routing table with k-buckets or with larger symbols (b)?
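One simple way to see why k chances help so much is the following sketch. It is an assumed model, not necessarily the derivation in the paper: every peer in the bucket is guaranteed to match one more bit, each further bit of a peer matches the target independently with probability 1/2, and the lookup picks the best of the k peers. The function name expected_bits_per_step is illustrative.

```python
def expected_bits_per_step(k: int, guaranteed_bits: int = 1, max_extra: int = 64) -> float:
    """Expected bits gained when choosing the best of k bucket entries."""
    # P(one peer matches at least j extra bits) = 2**-j, so
    # P(best of k matches at least j extra bits) = 1 - (1 - 2**-j)**k.
    # Summing these tail probabilities gives the expected extra bits.
    extra = sum(1 - (1 - 2.0 ** -j) ** k for j in range(1, max_extra + 1))
    return guaranteed_bits + extra

for k in (1, 10, 20):
    print(k, round(expected_bits_per_step(k), 2))
```

Under these assumptions, k = 20 gives an average close to the 5.7 bits per step quoted above, compared with the worst case of 1 bit per step.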
Analysis of Enriching Routing Tables

Two ways to increase lookup efficiency:
  • Larger symbols (b)
  • Larger buckets (k)
Both proposed in the Kademlia paper, but the benefits of buckets were not fully considered.
Which yields the most improvement?
  • See paper for detailed analysis
  • Asymptotically similar
  • Larger symbols are better by a constant factor (around 23% more bits per step)
Empirical Lookup Performance, Revisited
To compute the average case, we need to know the size of the k-buckets.
  • Kad uses buckets with k = 10.
  • However, due to churn:
      • The buckets are not always full.
      • Some entries may point to departed peers.
We need to examine buckets in the wild to determine how full they are in practice.
Extracting Kad Routing Tables

kFetch: a tool for extracting routing tables
  • Systematically generates queries for each k-bucket
  • TCP-like congestion controlled query rate
  • Probes each neighbor to determine if it departed
Findings
  • On average, k-buckets have 1 free slot
  • On average, k-buckets point to 1 or 2 departed peers
  • Overall, the mean k-bucket has 7.5 useful peers.
We now predict 2.9 steps per lookup.
  • Reasonably close to the measured 3.2 steps.
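As a rough illustration of how "useful peers per bucket" could be tallied from extracted tables and liveness probes, consider the sketch below. The bucket contents, peer names, and the helper useful_entries are made up for the example; only k = 10 comes from the slides.

```python
from statistics import mean

K = 10  # bucket capacity used by Kad, per the slides

def useful_entries(bucket: list, departed: set) -> int:
    """Entries that are present in the bucket and still alive."""
    return sum(1 for peer in bucket if peer not in departed)

# Hypothetical snapshot: each bucket has one free slot (9 of 10 filled).
buckets = [
    ["a", "b", "c", "d", "e", "f", "g", "h", "i"],
    ["j", "k", "l", "m", "n", "o", "p", "q", "r"],
]
departed = {"b", "k", "m"}  # peers that failed the liveness probe

mean_useful = mean(useful_entries(b, departed) for b in buckets)
print(mean_useful)
```

In this toy snapshot one bucket loses one entry and the other loses two, so the mean matches the pattern reported above: one free slot plus one or two departed peers out of k = 10.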
Improving Lookup Performance with Parallel Lookup
Types of Parallel Lookup
  • Strict
      • Have exactly α outstanding lookups
      • If we find a better next-hop, wait until one lookup completes.
      • Pro: Limited overhead
  • Loose
      • Always have outstanding lookups to the α best-known next-hops
      • If we find a better next-hop, send a lookup immediately.
      • Pro: May be faster
Key Questions:
  • Which one is better?
  • How much parallelism (α) is best?
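The difference between the two policies comes down to one scheduling decision: what to send when a better next-hop is discovered. The sketch below captures only that decision. The function next_queries and its arguments are hypothetical names, and timeouts, retries, and reply handling are left out.

```python
def next_queries(candidates, outstanding, queried, alpha, target, loose):
    """Peers to query right now under strict or loose parallelism.

    candidates:  all next-hops learned so far
    outstanding: peers with a query currently in flight
    queried:     peers already sent a query (includes outstanding)
    """
    def dist(peer):
        return peer ^ target  # XOR distance, as in Kademlia

    if loose:
        # Loose: keep a query open to each of the alpha best-known
        # next-hops, even if more than alpha end up in flight.
        best = sorted(candidates, key=dist)[:alpha]
        return [p for p in best if p not in queried]
    # Strict: never exceed alpha in flight; wait for a free slot.
    free = max(0, alpha - len(outstanding))
    fresh = sorted((p for p in candidates if p not in queried), key=dist)
    return fresh[:free]

target = 0b10011
candidates = [0b10010, 0b10100, 0b01010, 0b11111]  # 0b10010 just discovered
in_flight = {0b10100, 0b11111}

strict_now = next_queries(candidates, in_flight, in_flight, 2, target, loose=False)
loose_now = next_queries(candidates, in_flight, in_flight, 2, target, loose=True)
print(strict_now, loose_now)
```

With α = 2 and both slots busy, the strict policy sends nothing until a reply arrives, while the loose policy queries the newly discovered closer peer at once, at the cost of a third query in flight.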
Parallel Lookup




Using parallelism reduces latency from 10 s to 2 to 3 s.
Diminishing returns after α = 3.
Loose parallelism is slightly faster.
Strict parallelism is much more efficient.
Summary of Contributions




Analysis of average-case performance for prefix-matching DHTs.
  • Average performance can be dramatically different from the worst case.
  • k-buckets improve lookup efficiency.
Empirical study of improving performance with parallel lookup
  • Strict parallel routing offers the better trade-off: slightly slower than loose, but much more efficient.
  • Sweet spot at α = 3 outstanding lookups.
Empirical study of using replication to ensure consistency
  • 3 copies on nearby peers overcome lookup inconsistencies
  • See paper for details
Tools and techniques:
  • Kad Cruiser: Capture the peers in a Kad zone, measure size
  • kLookup: Emulate a lookup from any peer to any address
  • kFetch: Extract a peer’s routing table