Navigation in small worlds
Social Networks: Models and Applications
Seminar
Toronto, Fall 2007
(based on a presentation by Stratis Ioannidis)
the small-world phenomenon
“most people are linked by short chains of acquaintances”
Milgram’s experiment (1960s)
► people in Omaha, Nebraska, were each given a letter addressed to a target person in Boston, Massachusetts, along with demographic information (name, address, profession) on this person.
► they were asked to send the letter to the target person by forwarding it to other people that they knew on a first-name basis, instructing them to do the same.
► median number of hops to get the letter to the target: 6
-> six degrees of separation
significance of small-world phenomenon
► qualitatively similar results in subsequent experiments, e.g. [Dodds et al. ‘03]
► small-world phenomenon also appears in other networks:
▪ power grid
▪ actor collaboration graph
▪ WWW
▪ neural network of C. elegans
▪ semantic networks of languages
▪ food webs
▪ …
modeling the small-world phenomenon
► small-world network model:
1. short paths between almost all pairs of nodes
2. small node degree (on average)
3. locally clustered: a node’s neighbors are likely to be neighbors of each other
► a graph selected at random from all n-node graphs where each node has degree 3 has diameter O(log n), whp
► but it does not satisfy the clustering requirement
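One common way to quantify the clustering requirement is the local clustering coefficient: the fraction of pairs of a node's neighbors that are themselves adjacent. The following is a minimal sketch, assuming an undirected graph stored as a dict of adjacency sets; the function name and the two toy graphs are hypothetical and not from the slides.

```python
from itertools import combinations

def local_clustering(adj, u):
    """Fraction of pairs of u's neighbors that are themselves adjacent."""
    nbrs = adj[u]
    if len(nbrs) < 2:
        return 0.0  # convention assumed here: no pairs means no clustering
    pairs = list(combinations(nbrs, 2))
    linked = sum(1 for v, w in pairs if w in adj[v])
    return linked / len(pairs)

# Hypothetical toy graphs (undirected, as adjacency sets).
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}

print(local_clustering(triangle, 0))  # every pair of neighbors is linked
print(local_clustering(star, 0))      # no pair of neighbors is linked
```

A lattice with local edges scores high on this measure, while a sparse random graph scores close to zero, which is the gap the slide points at.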
[Watts-Strogatz ‘98]’s model
► d-dimensional lattice of n^d nodes [here, d=2]
► for each node u:
▪ local edges to nodes v, s.t. dist ρ(u,v) ≤ p [p=2]
▪ long-range directed edges to q random nodes selected independently & uniformly over all nodes [q=3]
► expected diameter O(log n)
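The construction can be sketched for d = 2 as follows. This is an illustrative sketch only: it assumes ρ is the lattice (Manhattan) distance, excludes self-loops, and uses hypothetical names (`build_lattice_with_shortcuts`, `local`, `shortcuts`) that do not come from the slides.

```python
import random

def build_lattice_with_shortcuts(n, p=1, q=1, seed=0):
    """n x n grid (assumes n >= 2): local edges within lattice distance p,
    plus q long-range edges per node chosen uniformly at random."""
    rng = random.Random(seed)
    nodes = [(i, j) for i in range(n) for j in range(n)]
    local, shortcuts = {}, {}
    for (x, y) in nodes:
        # Local edges: all grid points within Manhattan distance p.
        local[(x, y)] = [
            (x + dx, y + dy)
            for dx in range(-p, p + 1)
            for dy in range(-p, p + 1)
            if 0 < abs(dx) + abs(dy) <= p
            and 0 <= x + dx < n and 0 <= y + dy < n
        ]
        # Long-range edges: uniform over all other nodes (assumption: no self-loops).
        picks = []
        while len(picks) < q:
            v = rng.choice(nodes)
            if v != (x, y):
                picks.append(v)
        shortcuts[(x, y)] = picks
    return local, shortcuts

local, shortcuts = build_lattice_with_shortcuts(5, p=2, q=3)
print(len(local[(2, 2)]), shortcuts[(2, 2)])
```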
[Kleinberg ’00]: a new perspective on Milgram’s experiment
► “short paths not only exist, but can be found by individuals using only local information!”
► proposed a simple extension to Watts-Strogatz’s model and used it to demonstrate that:
▪ the ability to route efficiently with local information is not the same as having a small network diameter
▪ this ability is affected by the correlation between local structure and long-range connections; efficient routing is possible only when this correlation is near a critical threshold; as we move away from this threshold, routing deteriorates rapidly.
Kleinberg’s (grid-based) model
extends model of [Watts-Strogatz ‘98]:
► d-dimensional lattice of n^d nodes
► for each node u:
▪ local edges to nodes v, s.t. dist ρ(u,v) ≤ p
▪ long-range directed edges to q random nodes selected independently, s.t. Pr(u->v) ~ ρ(u,v)^{-a}
► a: concentration of long-range neighbors around u
▪ a small -> a = 0: connections close to uniformly random; this is the [Watts-Strogatz ‘98] model
▪ a large -> a = ∞: strong preference for close connections; long-range neighbors = local neighbors
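The effect of the exponent a on long-range neighbors can be sketched for d = 2 as below. This is a minimal sketch, assuming ρ is the lattice (Manhattan) distance; the helper names `long_range_weights` and `sample_long_range` are hypothetical.

```python
import random

def long_range_weights(u, n, a):
    """Unnormalized weights rho(u,v)^(-a) for every v != u on an n x n grid."""
    weights = {}
    for i in range(n):
        for j in range(n):
            if (i, j) != u:
                rho = abs(u[0] - i) + abs(u[1] - j)
                weights[(i, j)] = rho ** (-a)
    return weights

def sample_long_range(u, n, a, q=1, rng=random):
    """Draw q long-range neighbors of u independently with Pr ~ rho^(-a)."""
    weights = long_range_weights(u, n, a)
    nodes = list(weights)
    return rng.choices(nodes, weights=[weights[v] for v in nodes], k=q)

rng = random.Random(0)
print(sample_long_range((5, 5), 11, a=0, q=3, rng=rng))   # uniform over the grid
print(sample_long_range((5, 5), 11, a=50, q=3, rng=rng))  # effectively grid neighbors
```

With a = 0 every weight equals 1, which recovers the uniform choice; with a very large a nearly all probability mass sits on nodes at distance 1.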
decentralized algorithms
decentralized algorithm for transmitting messages:
► at each step the holder u of the message passes it to one of its
neighbors (local or long-range)
► u knows only
▪ the underlying grid structure
▪ the location of the target on the lattice
▪ the location and the long-range neighbors of all nodes that have touched the message so far
delivery time T:
► expected number of steps to forward a message from a random
source to a random target
Kleinberg’s results
when d = 2
1. for 0 ≤ a < 2, any decentralized algorithm has T = Ω(n^{(2-a)/3})
2. for a = 2, there is a decentralized algorithm s.t. T = O(log^2 n)
3. for a > 2, any decentralized algorithm has T = Ω(n^{(a-2)/(a-1)})
results can be extended for d ≠ 2, with the corresponding cases 0 ≤ a < d, a = d, and a > d, respectively
the upper bound when a = 2 is achieved by greedy algorithm:
► a node forwards a message for t to its neighbor v such that ρ(v,t) is minimum
[Martel-Nguyen ’04+’05]: diameter
1. for a ≤ d, Θ(log n)
2. for d < a < 2d, Polylog(n)
3. for a = 2d, ??
4. for a > 2d, Poly(n)
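Greedy forwarding and the delivery time T can be explored with a small simulation. The sketch below assumes d = 2, p = q = 1 and Manhattan distance; the function names are hypothetical. Each node's long-range contact is drawn only when the message reaches it, which is equivalent to drawing all contacts in advance because greedy routing never revisits a node (a grid neighbor always reduces the distance by 1).

```python
import random

def dist(u, v):
    """Lattice (Manhattan) distance."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])

def sample_contact(u, n, a, rng):
    """One long-range contact of u with Pr ~ dist(u,v)^(-a)."""
    nodes = [(i, j) for i in range(n) for j in range(n) if (i, j) != u]
    weights = [dist(u, v) ** (-a) for v in nodes]
    return rng.choices(nodes, weights=weights, k=1)[0]

def greedy_route(s, t, n, a, rng):
    """Number of greedy steps from s to t on an n x n grid."""
    u, steps = s, 0
    while u != t:
        candidates = [
            (u[0] + dx, u[1] + dy)
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= u[0] + dx < n and 0 <= u[1] + dy < n
        ]
        candidates.append(sample_contact(u, n, a, rng))
        u = min(candidates, key=lambda v: dist(v, t))  # neighbor closest to t
        steps += 1
    return steps

def estimate_delivery_time(n, a, trials=200, seed=0):
    """Average greedy steps over random source/target pairs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        s = (rng.randrange(n), rng.randrange(n))
        t = (rng.randrange(n), rng.randrange(n))
        total += greedy_route(s, t, n, a, rng)
    return total / trials

for a in (0, 2, 4):
    print(a, estimate_delivery_time(20, a, trials=100))
```

On grids this small the asymptotic separation between a = 2 and other exponents may be barely visible; the sketch is meant to make the procedure concrete, not to reproduce the bounds.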
outline of proof of the upper bound
► a = 2, p = q = 1
► in each step, the dist. from current node u to target t is halved with prob. ~1/log n [Ω(1/log n)]
► so, the expected number of steps until from u we reach a node u’ such that ρ(u’,t) ≤ ρ(u,t)/2 is at most ~log n [O(log n)]
► the target is reached after at most log n + 1 halvings, so, in ~log^2 n expected steps [O(log^2 n)]
► crucial property of a = 2: it produces long-range neighbors approx. uniformly distributed over “distance scales”: for u’s long-range neighbor v, the probability that 2^j ≤ ρ(u,v) ≤ 2^{j+1} is the same for all j
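The "distance scales" property can be checked numerically. The sketch below computes, for the center node of a square grid, the exact probability mass that Pr ~ ρ^{-a} places on each scale; it assumes Manhattan distance and uses half-open scales [2^j, 2^{j+1}) so that they do not overlap. The function name `scale_masses` is hypothetical.

```python
from collections import defaultdict

def scale_masses(side, a):
    """Probability that the center node's long-range neighbor falls in
    scale j, i.e. at distance rho with 2^j <= rho < 2^(j+1)."""
    c = side // 2
    mass = defaultdict(float)
    total = 0.0
    for i in range(side):
        for j in range(side):
            rho = abs(i - c) + abs(j - c)
            if rho == 0:
                continue
            w = rho ** (-a)
            mass[rho.bit_length() - 1] += w  # floor(log2(rho))
            total += w
    return {j: m / total for j, m in sorted(mass.items())}

for a in (0, 2):
    print(a, scale_masses(129, a))
```

For a = 2 the scales that fit entirely inside the grid receive roughly equal mass, while for a = 0 the mass grows quickly with j; the outermost scales are truncated by the grid boundary and should be ignored.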
outline of proof of lower bounds
► a = 0, p = q = 1
► Let U: set of nodes w s.t. ρ(w,t) < n^{2/3}
► |U| ~ n^{4/3}
► Prob(s ∈ U) ~ |U|/n^2 ~ 1/n^{2/3}
-> almost certainly s ∉ U
► if s ∉ U and no node u in the path to t has a long-range neighbor in U, then the number of steps to t is ≥ n^{2/3}
► for any u, Prob(u->U) ~ |U|/n^2 ~ 1/n^{2/3}
-> starting from s ∉ U, the expected number of steps to reach a node with a long-range neighbor in U is ~ 1/Prob(u->U) = n^{2/3}
► expected number of steps to t is ≥ n^{2/3}
► a = ∞, p = q = 1
► the random graph is the grid; expected number of steps ~n
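The a = 0 argument can be summarized as a short chain of estimates (constants dropped, as in the outline above):

```latex
\begin{align*}
|U| &\sim \bigl(n^{2/3}\bigr)^2 = n^{4/3}, \\
\Pr[u \to U] &\sim \frac{|U|}{n^2} = n^{-2/3}, \\
\mathbb{E}[\text{steps until some long-range edge enters } U]
  &\sim \frac{1}{\Pr[u \to U]} = n^{2/3}.
\end{align*}
```

As an illustrative instance, take n = 1000: the grid has 10^6 nodes, U has radius about 100 and roughly 10^4 nodes, a given long-range edge lands in U with probability about 1/100, and so about 100 steps are expected either way, whether the message waits for a useful long-range edge or walks into U over local edges.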
hierarchical model [Kleinberg 01]
[figure: tree with branching factor b=3; two leaves u, v with ρ(u,v) = 2]
► natural model for categorizing occupation, web pages,…
► ρ(u,v): height of lowest common ancestor of u,v
► polylog(n) long-range neighbors from distr ~ b^{-aρ(u,v)}; efficient routing only for a = 1
► [Watts et al. 02]: many indep. trees; ρ the min of dist. in any tree
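The tree distance and the resulting link weights can be sketched as follows, assuming the nodes are the leaves of a complete b-ary tree numbered 0, 1, 2, … from left to right (an indexing chosen here for illustration; the function names are hypothetical).

```python
def lca_height(x, y, b):
    """Height of the lowest common ancestor of leaves x and y
    in a complete b-ary tree with leaves numbered left to right."""
    h = 0
    while x != y:
        x //= b  # move both leaves up one level
        y //= b
        h += 1
    return h

def mass_per_height(u, num_leaves, b, a):
    """Total weight b^(-a * rho(u,v)) grouped by rho(u,v), over all v != u."""
    mass = {}
    for v in range(num_leaves):
        if v != u:
            h = lca_height(u, v, b)
            mass[h] = mass.get(h, 0.0) + b ** (-a * h)
    return mass

print(lca_height(2, 8, 3))
print(mass_per_height(0, 27, 3, 1))
```

For a = 1 every height receives the same total weight, (b-1)/b, which mirrors the "uniform over distance scales" property of the grid model at its critical exponent.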
group-based model [Kleinberg 01]
[figure: overlapping groups; two nodes u, v with ρ(u,v) = 6]
► set of groups {S_1, S_2, …}
▪ “bounded growth”: if S_j, S_k, … have sizes < g and all contain u, their union’s size is O(g)
► ρ(u,v): size of smallest group containing u,v
► polylog(n) long-range neighbors from distr ~ ρ(u,v)^{-a}; efficient routing only for a = 1 (and a > 1 in some cases)
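The group-based distance is easy to state in code. This is a minimal sketch with a hypothetical toy collection of groups; returning infinity when no group contains both nodes is an assumption made here for convenience.

```python
import math

def group_distance(u, v, groups):
    """Size of the smallest group containing both u and v
    (math.inf if no group contains both; an assumption of this sketch)."""
    sizes = [len(g) for g in groups if u in g and v in g]
    return min(sizes) if sizes else math.inf

# Hypothetical groups over nodes 1..6.
groups = [{1, 2}, {3, 4}, {1, 2, 3, 4, 5, 6}]
print(group_distance(1, 2, groups))
print(group_distance(1, 3, groups))
```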
rank-based model [Liben-Nowell et al. 05]
► based on data from LiveJournal
► variation of grid-based model to handle non-uniformity
► each lattice point has ≥1 people associated with it
► local edges to one of the people in each neighboring lattice point
► long-range edges to random nodes selected from distr. ~1/rank_u(v)
rank_u(v): rank of v when nodes sorted in increasing dist. from u
► delivery time (to lattice point) O(log^3 n)
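The rank-based link distribution can be sketched as below. The sketch assumes Euclidean distance between positions and breaks ties by name; both choices, the function names, and the toy positions are hypothetical and not taken from the slides.

```python
def ranks_from(u, positions):
    """rank_u(v) for every v != u: 1 for the closest node, 2 for the next, ..."""
    ux, uy = positions[u]

    def sq_dist(v):
        x, y = positions[v]
        return (x - ux) ** 2 + (y - uy) ** 2

    others = sorted((v for v in positions if v != u),
                    key=lambda v: (sq_dist(v), v))  # ties broken by name
    return {v: r for r, v in enumerate(others, start=1)}

def link_probabilities(u, positions):
    """Pr(u -> v) proportional to 1 / rank_u(v)."""
    ranks = ranks_from(u, positions)
    z = sum(1 / r for r in ranks.values())
    return {v: (1 / r) / z for v, r in ranks.items()}

# Hypothetical positions of four people.
positions = {"u": (0, 0), "a": (1, 0), "b": (3, 0), "c": (10, 0)}
print(ranks_from("u", positions))
print(link_probabilities("u", positions))
```

Because only the ordering of distances matters, the probabilities stay the same whether "c" is at distance 10 or 1000, which is how the rank-based rule adapts to non-uniform population density.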
related results
► decentralized search with additional information: a node may “consult” a small number of nearby nodes for free
▪ [Lebhar-Schabanel ‘04]: paths of O(log n (log log n)^2) steps with O(log^2 n) nodes consulted
▪ [Fraigniaud et al.’05], [Martel-Nguyen ’04]: paths of O(log^{3/2} n) steps by consulting neighborhood of size log(n) of current node
▪ [Manku et al.’04]: neighbor-of-neighbor approach: optimal for some settings
► alternative distributions for choosing long-range neighbors: can we improve routing by
▪ choosing long-range neighbors from a distribution other than ~1/ρ^a
▪ allowing variation in node degrees
▪ allowing dependence between long-range neighbors of same node?
-> (almost certainly) no: [Aspnes et al. ‘02], [Flamini et al.’05], [Giakkoupis-Hadzilacos ‘07], [Woelfel ’08?]
► what if we make edges to long-range neighbors bidirectional?
related results
► small-world networks on arbitrary underlying graphs: is it possible to augment any graph such that greedy routing is efficient? (greedy: wrt initial graph)
▪ [Fraigniaud ‘05]: yes, for graphs of bounded tree-width
▪ [Duchon et al.’06]: yes, for bounded growth rate
▪ [Slivkins ‘05]: yes, for doubling dimension O(log log n)
▪ [Fraigniaud ‘05]: no, for doubling dimension >> log log n
applications
► peer-to-peer networks
▪ file sharing
► searching the web
▪ focused web crawling
► sensor networks
► on-line communities
Thank you!