Impact of Neighbor Selection on
Performance and Resilience of
Structured P2P Networks
Sushma Maramreddy
Overlay networks




An overlay network is a logical network built
on top of one or more underlying networks
(e.g., the Internet).
The main purpose of these networks is to
provide an effective means by which a large
number of computing nodes are linked and
accessed.
Various distributed services can be built
on top of these networks.
Overlay networks are either structured or
unstructured.
Overlay networks Contd..


Unstructured overlay network – a
peer joins the network by
connecting itself to any node in the
network.
Structured overlay network – a peer
joins the network by connecting
itself to some other well-defined
peers using a logical identifier.
Authors



Byung-Gon Chun, U.C. Berkeley
Ben Y. Zhao, U.C. Santa Barbara
John D. Kubiatowicz, U.C. Berkeley
Impact of neighbor selection on
performance and Resilience





Introduction
Related work
Details of neighbor selection
Impact of cost functions on
Performance and Resilience
Conclusion
Introduction




Structured overlay networks route messages to
endpoints or nodes inside the network in a
logarithmic number of overlay hops, with
logarithmic routing state at each node.
Nodes choose their neighbors based on
optimization metrics.
A recent study by Gummadi et al. has shown
that neighbor selection based on network
proximity significantly increases overall
performance.
Problem – network imbalance.
Introduction Contd..




To better model neighbor selection across
the networks, a generalized cost model is
presented.
Most current protocols consider only
network proximity in neighbor selection.
This paper uses different models based on
network proximity and network capacity.
It studies the impact these models have on lookup
latency and static resilience in tree and
ring geometries.
Related Work



The closest work is by Gummadi et al., who
quantified the impact of routing
geometry on performance and resilience.
Albert et al. show a correlation between the
scale-free nature of networks and resilience to
attacks and failures.
Several researchers propose optimizing
the overlay construction of structured
overlays using network proximity, but
generally ignore CPU load, storage and
bandwidth capacity.
Structured Overlay Construction



Each node chooses neighbors that meet
logical identifier constraints (e.g., prefix
matching or identifier range) and builds
direct links.
These constraints are flexible enough that
several nodes are candidates for each routing
table entry.
The neighbor selection problem thus
reduces to a cost minimization problem.
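The reduction to cost minimization can be sketched as follows. This is a minimal illustration, assuming hex-string identifiers and a Tapestry-style prefix constraint; the function names and the toy cost table are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Optional


def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits two identifiers have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def candidates_for_entry(me: str, level: int, digit: str,
                         nodes: List[str]) -> List[str]:
    """Nodes sharing `level` leading digits with `me` and having `digit` next.

    Every such node satisfies the identifier constraint for routing table
    entry (level, digit), so the entry is free to pick among them.
    """
    return [
        n for n in nodes
        if n != me
        and shared_prefix_len(me, n) >= level
        and n[level] == digit
    ]


def pick_neighbor(candidates: List[str],
                  cost: Dict[str, float]) -> Optional[str]:
    """Choose the lowest-cost candidate (None if the entry has no candidate)."""
    return min(candidates, key=lambda n: cost[n], default=None)


# Toy example (made-up identifiers and costs):
# node "4a2f" fills the entry for level 1, next digit "b".
nodes = ["4a2f", "4b11", "4b9c", "4c00", "7b22"]
cost = {"4b11": 30.0, "4b9c": 12.0, "4c00": 5.0, "7b22": 1.0}
cands = candidates_for_entry("4a2f", 1, "b", nodes)
best = pick_neighbor(cands, cost)
```

Only "4b11" and "4b9c" satisfy the constraint here, so the cheaper nodes "4c00" and "7b22" are never considered: the identifier constraint bounds the search, and the cost function decides within it.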
Cost Model




Optimizing neighbor selection for
node i means minimizing the sum of
the costs from i to all other nodes.
The cost from i to j consists of two
factors: node cost and edge cost.
Node cost is the cost incurred by
intermediate nodes.
Edge cost is the cost incurred by
the network links.
Cost Model Contd..




N = network size
t(i, j) = traffic from i to j
cp(i, j) = cost of the path from i to j
V(i, j) = intermediate nodes on the path from i to j
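One way to read these symbols together is sketched below. It assumes that the cost for node i is the traffic-weighted sum of path costs, and that a path cost adds the node cost of every intermediate node to the edge cost of every hop. That combination is an assumption made for illustration, as are all function names and the toy numbers.

```python
from typing import Dict, List, Tuple

Path = List[str]  # full overlay path from source to destination, inclusive


def path_cost(path: Path,
              node_cost: Dict[str, float],
              edge_cost: Dict[Tuple[str, str], float]) -> float:
    """cp(i, j): node cost of intermediate nodes plus edge cost of each hop."""
    intermediate = path[1:-1]            # V(i, j): excludes both endpoints
    hops = list(zip(path, path[1:]))     # consecutive overlay links
    return (sum(node_cost[v] for v in intermediate)
            + sum(edge_cost[e] for e in hops))


def total_cost(src: str,
               paths: Dict[str, Path],
               traffic: Dict[Tuple[str, str], float],
               node_cost: Dict[str, float],
               edge_cost: Dict[Tuple[str, str], float]) -> float:
    """Sum over destinations j of t(i, j) * cp(i, j) for source i = src."""
    return sum(traffic[(src, dst)] * path_cost(p, node_cost, edge_cost)
               for dst, p in paths.items())


# Toy network (made-up values): A reaches B directly and C through B.
node_cost = {"A": 1.0, "B": 2.0, "C": 1.0}
edge_cost = {("A", "B"): 10.0, ("B", "C"): 5.0}
paths = {"B": ["A", "B"], "C": ["A", "B", "C"]}
traffic = {("A", "B"): 1.0, ("A", "C"): 2.0}
cost_a = total_cost("A", paths, traffic, node_cost, edge_cost)
```

In this toy case the path to C pays B's node cost once and both edge costs, so heavier traffic toward C makes the choice of B as a neighbor matter more.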
Cost Model Contd..


This model captures the
heterogeneity of node capacity, which is a
function of bandwidth, computation
power, disk access time and so on.
For structured networks such as
Chord, Pastry and Tapestry, the cost
function is defined as follows.
Cost function in structured networks






b = neighbor index
nb = neighbor indexed by b
Nb = number of neighbors
Rb = set of destinations routed through nb
cn(i) = node cost of node i, ce(e) = edge cost of link e
ce(k, l) = edge cost of the link between k and l
Neighbor selections





Four neighbor selection models:
Random – choose neighbors at random
Dist – choose the neighbors physically closest in the
network
Cap – choose the neighbors with the smallest processing
delay
CapDist – choose the neighbors with the smallest
combined delay (sum of node processing
delay and overlay link delay)
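The four models differ only in the score they minimize over the candidates for a routing table entry. The sketch below assumes each candidate has a known link delay and processing delay; the model strings, the function name, and the numbers are hypothetical.

```python
import random
from typing import Dict, List, Optional


def select_neighbor(model: str,
                    candidates: List[str],
                    link_delay: Dict[str, float],
                    proc_delay: Dict[str, float],
                    rng: Optional[random.Random] = None) -> str:
    """Pick one neighbor among the candidates for a routing table entry."""
    if model == "random":
        return (rng or random).choice(candidates)
    if model == "dist":       # closest in the network: smallest link delay
        return min(candidates, key=lambda n: link_delay[n])
    if model == "cap":        # smallest node processing delay
        return min(candidates, key=lambda n: proc_delay[n])
    if model == "combined":   # smallest sum of processing and link delay
        return min(candidates, key=lambda n: proc_delay[n] + link_delay[n])
    raise ValueError(f"unknown model: {model}")


# Made-up delays in milliseconds for three candidates.
candidates = ["x", "y", "z"]
link_delay = {"x": 5.0, "y": 20.0, "z": 8.0}
proc_delay = {"x": 50.0, "y": 1.0, "z": 4.0}

by_dist = select_neighbor("dist", candidates, link_delay, proc_delay)
by_cap = select_neighbor("cap", candidates, link_delay, proc_delay)
by_sum = select_neighbor("combined", candidates, link_delay, proc_delay)
by_rand = select_neighbor("random", candidates, link_delay, proc_delay,
                          rng=random.Random(0))
```

With these numbers each deterministic model picks a different node: the nearest node is slow, the fastest node is far away, and the combined score prefers a node that is neither the nearest nor the fastest.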
Cost functions studied


cn(i) - processing delay at node i
ce(i, nb) - direct overlay link delay
between node i and node nb
Simulation - set up




Simulate Tapestry and Chord as
representatives of tree and ring geometries.
Simulations use 5100-node network topologies.
In Chord, each node forwards messages to the live
neighbor closest to the destination. A lookup fails
if all neighbors before the destination in the
namespace have failed.
In Tapestry, each node forwards messages to the
first live neighbor matching one more prefix digit.
If all primary and backup links in the routing
entry fail, the lookup fails.
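The two forwarding rules can be sketched as next-hop functions. This is a simplified illustration, assuming integer identifiers on a ring for Chord and an ordered list of links (primary first, then backups) for a Tapestry routing entry; the function names and example values are hypothetical and do not reproduce the simulator used in the paper.

```python
from typing import Iterable, List, Optional, Set


def clockwise(a: int, b: int, ring_size: int) -> int:
    """Clockwise distance from a to b on the identifier ring."""
    return (b - a) % ring_size


def chord_next_hop(current: int, dest: int, neighbors: Iterable[int],
                   live: Set[int], ring_size: int) -> Optional[int]:
    """Live neighbor closest to dest that does not overshoot it."""
    span = clockwise(current, dest, ring_size)
    usable = [n for n in neighbors
              if n in live and 0 < clockwise(current, n, ring_size) <= span]
    if not usable:
        return None  # every neighbor before the destination has failed
    return max(usable, key=lambda n: clockwise(current, n, ring_size))


def tapestry_next_hop(entry: List[str], live: Set[str]) -> Optional[str]:
    """First live link in a routing entry; None means the lookup fails."""
    return next((n for n in entry if n in live), None)


# Chord example: node 0 on a 64-slot ring with power-of-two neighbors.
fingers = [1, 2, 4, 8, 16, 32]
hop_all_live = chord_next_hop(0, 40, fingers, set(fingers), 64)
hop_32_down = chord_next_hop(0, 40, fingers, set(fingers) - {32}, 64)
hop_none = chord_next_hop(0, 40, fingers, set(), 64)

# Tapestry example: one routing entry with a primary and two backups.
entry = ["primary", "backup1", "backup2"]
hop_backup = tapestry_next_hop(entry, {"backup1", "backup2"})
```

The sketch makes the difference in failure behavior visible: Chord falls back to any live neighbor that still makes progress, whereas Tapestry can only fall back within the links stored for that one routing entry.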
Simulation results - Performance




Two different distributions of node
processing delay:
uniform and bimodal.
In the uniform case, processing
delay is drawn uniformly from a/10, 2a/10, ..., a,
where a is the maximum processing delay.
In the bimodal case, nodes are either fast or slow.
Fast nodes process 100 messages/sec
and slow nodes process 1 message/sec.
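The two distributions can be sampled as in the sketch below. Two assumptions are made for illustration: a processing rate is converted to a per-message delay as 1/rate, and the share of fast nodes is a free parameter (`fast_fraction`), since no value is stated here. The function names are hypothetical.

```python
import random
from typing import List


def uniform_delays(n: int, a: float, rng: random.Random) -> List[float]:
    """Draw n delays uniformly from the ten levels a/10, 2a/10, ..., a."""
    levels = [a * k / 10 for k in range(1, 11)]
    return [rng.choice(levels) for _ in range(n)]


def bimodal_delays(n: int, fast_fraction: float,
                   rng: random.Random) -> List[float]:
    """Assign each node the fast or the slow per-message delay (seconds)."""
    fast = 1 / 100   # 100 messages/sec, assuming delay = 1 / rate
    slow = 1 / 1     # 1 message/sec
    return [fast if rng.random() < fast_fraction else slow
            for _ in range(n)]


rng = random.Random(42)                     # fixed seed for repeatability
uni = uniform_delays(1000, 0.5, rng)        # a = 0.5 s
bi = bimodal_delays(1000, 0.5, rng)         # fast_fraction is an assumed value
```

Under this reading, slow nodes are 100 times slower than fast ones, which is why a capacity-aware model has much more to gain in the bimodal case than in the uniform one.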
Simulation Results - Uniform
Simulation Results - Bimodal
Simulation Results – Static Resilience



Measure resilience as the proportion of all
pairs of live nodes that can still route to
each other after an external event, either
randomized node failures or targeted
attacks.
Assumption: attacks remove the
nodes with the highest in-degree in order
to maximize damage to overall network
reachability.
Assume nodes have a uniform
processing delay distribution with a = 0.5 s.
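The resilience metric and the targeted attack can be sketched as below. As a stand-in for protocol routing, the example uses plain graph reachability over live links, which is an assumption and is more permissive than Chord or Tapestry forwarding; a real study would plug the protocol's own lookup in as `can_route`. All names and the toy routing tables are hypothetical.

```python
from itertools import permutations
from typing import Callable, Dict, List, Set

Tables = Dict[int, List[int]]  # node -> neighbors it links to


def in_degree(tables: Tables) -> Dict[int, int]:
    """Count how many routing tables point at each node."""
    deg = {n: 0 for n in tables}
    for nbrs in tables.values():
        for n in nbrs:
            deg[n] = deg.get(n, 0) + 1
    return deg


def targeted_failures(tables: Tables, k: int) -> Set[int]:
    """The k nodes with the highest in-degree (ties broken by node id)."""
    deg = in_degree(tables)
    return set(sorted(deg, key=lambda n: (-deg[n], n))[:k])


def reachable(src: int, dst: int, tables: Tables, live: Set[int]) -> bool:
    """Stand-in router: can dst be reached from src over live links?"""
    seen, frontier = {src}, [src]
    while frontier:
        node = frontier.pop()
        if node == dst:
            return True
        for n in tables[node]:
            if n in live and n not in seen:
                seen.add(n)
                frontier.append(n)
    return False


def static_resilience(tables: Tables, failed: Set[int],
                      can_route: Callable[[int, int, Tables, Set[int]], bool]
                      ) -> float:
    """Fraction of ordered pairs of live nodes that can still route."""
    live = set(tables) - failed
    pairs = list(permutations(sorted(live), 2))
    if not pairs:
        return 0.0
    ok = sum(1 for s, d in pairs if can_route(s, d, tables, live))
    return ok / len(pairs)


# Toy overlay in which node 2 is the most popular neighbor.
tables = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
before = static_resilience(tables, set(), reachable)
attacked = targeted_failures(tables, 1)
after = static_resilience(tables, attacked, reachable)
```

Removing the single highest in-degree node disconnects most of the remaining pairs in this toy overlay, which is the effect the targeted-attack experiments measure at scale.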
Static Resilience


For Tapestry, we examine the resilience of
the base protocol, the base protocol
plus additional backup routes (all
chosen using the neighbor selection
algorithms), and the base protocol plus
backup routes chosen at random.
For Chord, we examine the base
protocol and the base protocol plus
sequential neighbors.
Random node failures - Tapestry
Random node failures - Chord
Targeted node attacks - Tapestry
Targeted node attacks - Chord
Simulation Results




Attacking nodes with high in-degree
severely affects network connectivity.
Random shows the best attack tolerance
among the neighbor selection models.
CapDist has worse attack tolerance
than Dist, although it has better
performance.
This result demonstrates a tradeoff between
performance and attack resilience in
structured P2P overlay construction.
Conclusion and Future Work




Took a quantitative approach to examining
the benefits and costs of considering
network or physical characteristics in
overlay construction.
The choice of neighbor selection algorithm
drives a tradeoff between performance
and resilience.
If high-degree nodes are attacked, the
impact on network connectivity is severe.
Future work: investigate the resilience
of different geometries under different
neighbor selection algorithms.