A statistical construction of power-law networks

International Journal of Parallel, Emergent and Distributed Systems
Vol. 25, No. 3, June 2010, 223–235
Shilpa Ghadge^a*, Timothy Killingback^b,1, Bala Sundaram^c,2 and Duc A. Tran^a,3
^a Department of Computer Science, University of Massachusetts, Boston, MA 02125, USA;
^b Department of Mathematics, University of Massachusetts, Boston, MA 02125, USA;
^c Department of Physics, University of Massachusetts, Boston, MA 02125, USA
(Received 6 July 2009; final version received 15 October 2009)
We propose a new mechanism for generating networks with a wide variety of degree
distributions. The idea is a modification of the well-studied preferential attachment
scheme in which the degree of each node is used to determine its evolving connectivity.
Modifications to this base protocol to include features other than connectivity have
been considered in building the network. However, some of the existing models are
merely formulaic and do not offer an explanation that can be interpreted naturally or
intuitively, while the others require certain information that is not available in many
real-world circumstances. We propose instead a protocol based only on a single
statistical feature which results from the reasonable assumption that the effect of
various attributes, which determine the ability of each node to attract other nodes, is
multiplicative. This composite attribute or fitness is lognormally distributed and is
used in forming the complex network. We show that, by varying the parameters of
the lognormal distribution, we can recover both exponential and power-law degree
distributions. The exponents for the power-law case are in the correct range seen in
real-world networks. Further, as power-law networks with exponents in the same range
are a crucial ingredient of efficient search algorithms in P2P networks, we believe our
network construct may serve as a basis for new protocols that will enable P2P networks
to efficiently establish a topology conducive to optimised search procedures.
Keywords: power-law networks; growing random networks; lognormal distribution;
peer-to-peer networks
1. Introduction
In the last decade, there has been much interest in studying complex real-world networks
and attempting to find theoretical models that elucidate their structure. A recent surge
in activity has often been seen as having started with Watts and Strogatz’s paper on
‘small-world networks’ [33]. More recently, the major focus of research has moved from
small-world networks to ‘scale-free’ networks, which are characterised by having
power-law degree distributions [5]. Empirical studies have shown that in many large
networks, including the World Wide Web [2], the Internet [12], metabolic networks [14],
protein networks [13], co-authorship networks [22] and sexual contact networks [19], the
degree distribution exhibits a power-law tail: that is, if p(k) is the fraction of nodes in the
network having degree k (i.e. having k connections to other nodes) then (for suitably large k)
$$p(k) = c\,k^{-\lambda}, \tag{1}$$
where $c = (\lambda - 1)\,m^{\lambda - 1}$ is a normalisation factor and m is the minimal degree in the
network.
*Corresponding author. Email: [email protected]
ISSN 1744-5760 print/ISSN 1744-5779 online
© 2010 Taylor & Francis
DOI: 10.1080/17445760903429963
http://www.informaworld.com
One of the earliest theoretical models of a complex network, that of a random graph,
was proposed and studied in detail by Erdős and Rényi [9–11] in a famous series of
papers in the 1950s and 1960s. The Erdős–Rényi random graph model consists of n nodes
(or vertices) joined by links (or edges), where each possible edge between two vertices is
present independently with probability p and absent with probability $1 - p$.
The degree distribution of the Erdős–Rényi random graph model is easy to determine.
The probability p(k) that a vertex in a random graph has exactly degree k is given by the
binomial distribution
$$p(k) = \binom{n-1}{k}\, p^{k} (1 - p)^{n-k-1}. \tag{2}$$
In the limit when $n \gg kz$, where $z = (n - 1)p$ is the mean degree, the degree distribution
becomes the Poisson distribution
$$p(k) = \frac{z^{k} e^{-z}}{k!}. \tag{3}$$
The Poisson distribution is strongly peaked about the mean z, and has a tail that decays very
rapidly as 1/k!. The rapid decay of the tail of the degree distribution for the Erdős–Rényi
random graph is completely different from the heavy-tailed power-law nature of the tail of
the degree distribution that is observed in many real-world complex networks.
Consequently, random graphs of the Erdős–Rényi type provide poor theoretical models
for most real complex networks.
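The closeness of the binomial form (2) to its Poisson limit (3) can be checked numerically. The following is a minimal sketch; the values of n and p are illustrative choices of ours, not taken from any network discussed in this paper.

```python
import math

def binomial_degree_pmf(k, n, p):
    """Equation (2): probability that a vertex of the random graph has degree k."""
    return math.comb(n - 1, k) * p**k * (1 - p) ** (n - 1 - k)

def poisson_degree_pmf(k, z):
    """Equation (3): Poisson approximation with mean degree z."""
    return z**k * math.exp(-z) / math.factorial(k)

n, p = 10_000, 0.0004      # illustrative values (assumption)
z = (n - 1) * p            # mean degree
for k in range(0, 13, 2):
    # The two columns agree closely, and both fall off rapidly for k well above z.
    print(k, binomial_degree_pmf(k, n, p), poisson_degree_pmf(k, z))
```

Both columns decay far faster than any power law, which is the mismatch with real-world degree distributions described above.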
The near ubiquity of heavy-tailed degree distributions (such as the power law (1)) for
real-world complex networks, together with the inadequacy of the Erdős–Rényi random
graph as a theoretical model for such networks, brings into sharp relief the fundamental
problem of obtaining a satisfactory theoretical explanation for how heavy-tailed degree
distributions can naturally arise in complex networks. The dominant concept that is
traditionally believed to underlie the emergence of heavy-tailed degree distributions in
complex networks is the mechanism of preferential attachment proposed by Barabási and
Albert [5]. The preferential attachment algorithm is as follows: starting from a small
number ($n_0$) of nodes (which could, for example, be chosen to form a random graph), at
every time step we add a new node with $m \le n_0$ edges linking the new node to m nodes
already present in the network. The probability $\Pi_i$ that the new node will connect to node i
is taken to be proportional to the degree $k_i$ of that node, thus
$$\Pi_i = \frac{k_i}{\sum_j k_j}. \tag{4}$$
(The sum in this equation is over the existing nodes of the network at the current state.)
This algorithm leads to a growing random network which simulations and analytic
arguments show has a power-law degree distribution with $\lambda = 3$. The exponent of the
power law is independent of m, which only changes the mean degree of the network.
An illustration of the differences in both construction and outcomes of random and
scale-free networks is shown in Figure 1.
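The preferential attachment rule described above can be sketched in a few lines of Python. This is an illustration under our own assumptions rather than the reference implementation of [5]: the initial nodes are arranged in a ring so that every node starts with non-zero degree, and duplicate targets are simply redrawn.

```python
import random
from collections import Counter

def preferential_attachment(n, m, n0=None, seed=0):
    """Grow a network by degree-proportional attachment (sketch).

    Assumption: the n0 >= 3 initial nodes form a ring.
    """
    rng = random.Random(seed)
    n0 = max(m, 3) if n0 is None else n0
    edges = [(i, (i + 1) % n0) for i in range(n0)]
    # Each node appears in `stubs` once per incident edge, so a uniform draw
    # from `stubs` selects node i with probability k_i / sum_j k_j.
    stubs = [v for e in edges for v in e]
    for new in range(n0, n):
        targets = set()
        while len(targets) < m:            # m distinct existing nodes
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
    return edges

edges = preferential_attachment(n=2000, m=2)
degree = Counter(v for e in edges for v in e)
print(len(edges), max(degree.values()))
```

Note that the `stubs` list is exactly the global degree information that a joining node would need to have, which is the practical objection to this mechanism raised later in this section.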
Figure 1. Construction of random and scale-free networks: (a) Erdős–Rényi random graph (left)
with network size $N = 10$ and linking probability $p = 0.2$ and the scale-free construction (right);
(b) degree distributions for ER networks (left) and scale-free networks (right); (c) the appearance of a
random network is rather homogeneous (left), i.e. most nodes have approximately the same number
of links. In scale-free networks (right), a few nodes have a large number of links while the majority
are sparsely connected (one or two links). Both the networks shown contain the same number of
nodes and links. The figure is taken from ‘The physics of the web’ by Barabási [4].
Despite the elegance and simplicity of the preferential attachment process, it has clear
deficiencies as a general explanation for the heavy-tailed nature of the degree distribution
of many complex networks. The most basic deficiency of preferential attachment as a
mechanism of general importance in real-world networks is that it requires a node that is
joining the network to have access to information about the degrees of all the existing
nodes. In most real-world situations, such information is simply not available, and even in
those circumstances in which it is available, it is often quite implausible that new nodes
use this information when making connections. A brief consideration of one real-world
network serves to make the point. The sexual contact network studied in [19] is known
to have a power-law degree distribution. The explanation for this phenomenon according
to the preferential attachment hypothesis is that a new individual will preferentially seek to
have sexual contact with those individuals who already have a large number of sexual
contacts. However, the information about how many sexual contacts individuals have is
typically not publicly available and cannot be used by any preferential attachment scheme,
and even if this information were somehow available, the preferential attachment
hypothesis seems to be a bizarre explanation for human sexual behaviour. Similar
objections to the preferential attachment hypothesis are apparent when one considers other
complex networks, such as the WWW, or metabolic or protein networks.
2. Lognormal fitness attachment protocol
We propose a new explanation for the heavy-tailed nature of the degree distribution of
many complex networks that does not require any knowledge of the degrees of existing
nodes and is essentially statistical in nature.
To motivate the definition of our procedure, we observe that in most real-world
networks the nodes will have an attribute or an associated quantity that represents, in some
way, the likelihood that other nodes in the network will connect to them. For example,
in a citation network (see, e.g. [22]), the different nodes (i.e. papers) will have different
propensities to attract links (i.e. citations). The various factors that contribute to the
likelihood of a paper being cited could include the prominence of the author(s), the
importance of the journal in which it is published, the apparent scientific merit of
the work, the timeliness of the ideas contained in the paper, etc. Moreover, it is plausible
that the overall quantity that determines the propensity of a paper to be cited depends
essentially multiplicatively on such various factors. The multiplicative nature is likely in
this case since if one or two of the factors happen to be very small, then the overall
likelihood of a paper being cited is often also small, even when other factors are not small.
The case of Mendel’s work on genetics constitutes an exemplar of this – an unknown
author and an obscure journal were enough to bury a fundamentally important scientific
paper.
Motivated by this and other examples, we consider it reasonable that in many
complex networks each node will have associated with it a quantity, which represents the
propensity of the node to attract links, and this quantity will be formed multiplicatively from
a number of factors. That is, to any node i in the network there is associated a non-negative
real number $F_i$, which is called the fitness of node i, and which is of the form
$$F_i = \prod_{l=1}^{L} f_l, \tag{5}$$
where each $f_l$ is non-negative and real. Assuming that the number of factors, L, is
reasonably large and that they are statistically independent, the fitness $F_i$ will be
lognormally distributed, irrespective of the manner in which the individual factors $f_l$ are
distributed. We recall that a random variable X has a lognormal distribution if the random
variable $Y = \ln X$ has a normal distribution. To be specific, from Equation (5), we have
$$\ln F_i = \sum_{l=1}^{L} \ln f_l. \tag{6}$$
The central limit theorem implies that, essentially irrespective of how the factors $f_l$ are
distributed, the sum will converge to a normal distribution. Thus, $\ln F_i$ will be normally
distributed and $F_i$ is therefore lognormally distributed. For a general discussion
of multiplicative randomness and the lognormal distribution in many areas of science
see [20].
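The argument based on Equations (5) and (6) can be illustrated numerically. The sketch below assumes, purely for illustration, that the factors are independent and uniformly distributed on (0.5, 1.5) and that there are L = 50 of them; neither choice comes from the paper. It then checks that the logarithm of the product behaves like a normal variable.

```python
import math
import random
import statistics

rng = random.Random(1)
L = 50              # number of multiplicative factors (illustrative choice)
samples = 20_000

def fitness():
    # Assumption: factors are i.i.d. uniform on (0.5, 1.5); any positive
    # distribution with finite variance of ln f would serve equally well.
    return math.prod(rng.uniform(0.5, 1.5) for _ in range(L))

log_fitness = [math.log(fitness()) for _ in range(samples)]
mu = statistics.fmean(log_fitness)
sigma = statistics.stdev(log_fitness)

# For a normal distribution about 68.3% of the mass lies within one
# standard deviation of the mean; ln F should come close to that.
within_one_sigma = sum(abs(y - mu) <= sigma for y in log_fitness) / samples
print(f"mean={mu:.3f}, sd={sigma:.3f}, within 1 sd={within_one_sigma:.3f}")
```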
The density function of the normal distribution is
$$f(y) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(y-\mu)^2/2\sigma^2}, \tag{7}$$
where $\mu$ is the mean and $\sigma$ is the standard deviation (i.e. $\sigma^2$ is the variance). The range of
the normal distribution is $y \in (-\infty, \infty)$. It follows from the logarithmic relation $Y = \ln X$
that the density function of the lognormal distribution is given by
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma x}\, e^{-(\ln x-\mu)^2/2\sigma^2}. \tag{8}$$
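The step from the normal density to the lognormal density is the usual change of variables for probability densities. Spelled out as a sketch, with $f_Y$ and $f_X$ used here only to tell the two densities apart (the text writes both as f):

```latex
\begin{align*}
f_X(x) &= f_Y(\ln x)\,\left|\frac{d(\ln x)}{dx}\right| \\
       &= \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(\ln x-\mu)^2/2\sigma^2}\cdot\frac{1}{x},
       \qquad x > 0 .
\end{align*}
```

The extra factor $1/x$ is what skews the lognormal density towards small values and gives it a heavy right tail.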
It is conventional to say that the lognormal distribution has parameters $\mu$ and $\sigma$ when the
associated normal distribution has mean $\mu$ and standard deviation $\sigma$. The range of the
lognormal distribution is $x \in (0, \infty)$. The lognormal distribution is skewed with mean
$e^{\mu+\sigma^2/2}$ and variance $(e^{\sigma^2} - 1)\,e^{2\mu+\sigma^2}$. Without loss of generality, we assume $\mu = 0$
(a non-zero $\mu$ only multiplies every fitness by the common factor $e^{\mu}$, which leaves relative
fitness unchanged); hence, the fitness distribution is characterised by only a single parameter $\sigma$. This
lognormal distribution has mean $e^{\sigma^2/2}$ and variance $(e^{\sigma^2} - 1)\,e^{\sigma^2}$.
We now propose our protocol to construct a complex network, which we shall refer to
as Lognormal Fitness Attachment. The parameters for the protocol are
• $\sigma$: parameter for the lognormal distribution used to generate the fitness
• $n_0$: a small number of nodes to start growing the network with
• $m \le n_0$: the number of nodes a node connects to when it first joins the network
• $n$: the number of nodes in the final network
The protocol then works as follows:
• Initially, we assume that the network is a graph with $n_0$ nodes. Each of these initial
nodes has a fitness generated according to the lognormal distribution with parameter
$\sigma$.
• At every time-step we add a new node to the network. This new node has its
associated lognormally distributed random fitness and m edges are added to link
the new node to m distinct nodes already present in the network. Suppose that the
network currently consists of $n'$ nodes $\{1, 2, \ldots, n'\}$ ($n' < n$). The m nodes for the
new node to connect to are selected such that a node i is selected with a probability
$\Pi_i$ proportional to the fitness $F_i$ of that node, thus
$$\Pi_i = \frac{F_i}{\sum_{j=1}^{n'} F_j}. \tag{9}$$
• The protocol stops when the number of nodes in the network reaches n.
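The steps above can be sketched in Python as follows. This is a minimal illustration, not the code used for the simulations reported in this paper. Two details are our own assumptions: the initial nodes start with no edges (the protocol allows any initial graph), and the m distinct targets are obtained by redrawing whenever a duplicate comes up.

```python
import random
from collections import Counter

def lognormal_fitness_attachment(n, m, n0, sigma, seed=0):
    """Grow a network in which targets are drawn in proportion to fitness.

    Assumptions (for illustration): the initial n0 nodes are isolated, and
    duplicate targets are redrawn until m distinct nodes are found.
    """
    if not m <= n0 <= n:
        raise ValueError("need m <= n0 <= n")
    rng = random.Random(seed)
    fitness = [rng.lognormvariate(0.0, sigma) for _ in range(n0)]
    edges = []
    for new in range(n0, n):
        existing = range(new)            # nodes currently in the network
        targets = set()
        while len(targets) < m:
            # random.choices draws node i with probability F_i / sum_j F_j.
            targets.update(
                rng.choices(existing, weights=fitness, k=m - len(targets))
            )
        edges.extend((new, t) for t in targets)
        fitness.append(rng.lognormvariate(0.0, sigma))   # fitness of the new node
    return edges, fitness

edges, fitness = lognormal_fitness_attachment(n=2000, m=2, n0=3, sigma=1.5)
degree = Counter(v for e in edges for v in e)
print(len(edges), max(degree.values()))
```

Only the fitness values enter the selection step; no node degree is consulted at any point. For very large values of sigma a single node holds almost all of the fitness, so the redraw loop becomes slow and a weighted sampling-without-replacement routine would be a better choice.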
The network in its initial state with $n_0$ nodes can be a complete graph, a tree or just a set
of isolated nodes. In general, the initial network can be any graph of $n_0$ nodes. Constructed
by the above protocol, a network of n nodes has $m(n - n_0) + e_0$ links, where $e_0$ is the
number of links in the initial network. Thus, the parameter m can be used to obtain any
desired average degree for the network. We note that the proposed network construction
scheme does not make any use of information about the degrees of the different nodes.
We also note that the only assumption we make about the fitness values is that they are
lognormally distributed. In the next section, we show that by varying only parameter $\sigma$, the
lognormal fitness attachment model not only results naturally in networks with power-law
degree distributions with exponent $\lambda$ in the range between 2 and 3, which are seen in many
real-world networks [4], but also in exponential degree distributions, which are also
commonly observed.
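As a worked instance of the link count $m(n - n_0) + e_0$ given above (a sketch; the symbol $\langle k \rangle$ for the mean degree is our notation): each link contributes to the degree of two nodes, so

```latex
\[
\langle k \rangle \;=\; \frac{2\,\bigl[\,m(n - n_0) + e_0\,\bigr]}{n}
\;\longrightarrow\; 2m \qquad (n \to \infty).
\]
```

Taking, purely as an illustration, $m = 2$, $n_0 = 3$, $e_0 = 3$ (a triangle) and $n = 200{,}000$, the network has $2 \times 199{,}997 + 3 = 399{,}997$ links and $\langle k \rangle \approx 4$.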
3. Simulation results
In our construction protocol, the fitness $F_i$ associated with each node i in the network
is chosen randomly from a lognormal distribution with parameters $\mu = 0$ and $\sigma$.
The parameter $\sigma$ is the most important parameter, as it changes the skewness of the
distribution. In order to unambiguously correlate the fitness distribution with properties
of the resulting network, in our evaluation, we have kept all parameters fixed, except $\sigma$.
In each case shown, the network is built from a small initial template of $n_0 = 3$ nodes,
each with two edges. Each new node comes with two edges, which means $m = 2$. Though
we have considered networks of various sizes N, the results shown correspond
to N = 200,000. Varying the only remaining parameter $\sigma$ results in a wide range of
network degree distribution properties.
Figure 2 demonstrates the fact that our statistical protocol recovers all recognised
network configurations. Three representative values of $\sigma$ and, in each case, the original
fitness distribution and the resulting network degree distribution are shown.
In Figure 2(a),(b), we see the case of small $\sigma$ ($\sigma = 0.1$). The fitness distribution makes
clear the feature that, in this case, the variability in fitness is restricted. This closely
resembles a growing random network construction, an expectation borne out by the
resulting degree distribution which exhibits clear exponential behaviour.
Figure 2. Changing degree distributions with varying parameter $\sigma$ for networks with N = 200,000
nodes. In each case the side-by-side panels show the fitness distribution and the corresponding
degree distribution which results: (a) lognormally distributed fitness for $\mu = 0$ and $\sigma = 0.1$;
(b) resulting degree distribution corresponding to (a). Note the log-linear scale, which indicates exponential
fall-off. This case would correspond to a random graph; (c) an increasingly skewed distribution with
$\mu = 0$ and $\sigma = 1.5$ and (d) corresponding degree distribution plotted on a log–log scale, clearly
showing power-law behaviour. The corresponding network would be scale free; (e) extremely skewed
distribution for $\mu = 0$ and $\sigma = 9$. Here, the probability of getting a single small value is very high
(nearly one), resulting in (f) a distribution where one node completely dominates the network.
The fact that the highest degree here is 199,520 makes clear why this is commonly referred to as a
‘winner-take-all’ network.
[Panels (a), (c), (e): PDF against fitness. Panels (b), (d), (f): count against connectivity.]
On making $\sigma$ larger, the fitness distribution allows for a wider range of values which
dramatically changes the degree distribution of the resulting network. Here, we have a
power-law network, which for $\sigma = 1.5$ exhibits an exponent around 3, which is the same
as that obtained from the Barabási construction.
On increasing $\sigma$ to a more extreme value, say 9, we see that most nodes will have small
fitness values while very few will get values from the tail. This results in a distribution
where a single node dominates the network in terms of connectivity. In the case shown,
199,520 (out of a possible 199,999) connections are made to a single node. The next most
connected node has a mere 149 connections. This monopolistic, winner-takes-all outcome
has also been noted in some contexts [4]. These results clearly indicate that our protocol
recovers the spectrum of networks reported in the literature.
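One way to see why these three regimes arise is to look at how much of the total fitness the single fittest node holds, since that share is its probability of being chosen in any one draw. The sketch below is our own illustration; the sample size and the helper name `top_fitness_share` are assumptions, not part of the reported simulations.

```python
import random

def top_fitness_share(sigma, n=20_000, seed=0):
    """Fraction of the total fitness held by the single fittest node."""
    rng = random.Random(seed)
    fitness = [rng.lognormvariate(0.0, sigma) for _ in range(n)]
    return max(fitness) / sum(fitness)

for sigma in (0.1, 1.5, 9.0):
    # The share grows with sigma: negligible for small sigma, and a
    # substantial fraction of all attachment probability for large sigma.
    print(sigma, top_fitness_share(sigma))
```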
Figures 3(a),(b) and 4 show the networks that result as $\sigma$ is increased from small to
large values. In Figure 3(a), a small value of $\sigma$ leads to a network similar to that obtained
when the attachment probability is kept approximately constant. We note that the network
shown is akin to that displayed by neural networks of Escherichia coli bacteria and the
electric grid of Southern California [3]. Figure 3(b) shows the other end of the spectrum of
behaviour where for large $\sigma$, a few nodes dominate the attachment in the network. Though
this is not as clear as the case shown in Figure 2(f), it nevertheless illustrates the
winner-take-all behaviour indicated earlier. Intermediate values of $\sigma$ lead to networks like
those shown in Figure 4, where clustering is seen on multiple scales, a characteristic of
power-law behaviour.
The transition between the regimes of exponential and power-law degree distributions
is not discontinuous. This is seen from Figure 5, where the degree distribution is plotted on
a log-linear scale for a range of $\sigma$ values. A seamless transition from random to scale-free
network behaviour, which coincides with the degree distribution changing from
exponential to power law, is seen with changing $\sigma$.
Figure 3. Resulting network for (a) $\sigma = 0.01$, which reflects the homogeneity associated with
random networks, and (b) $\sigma = 9$, which represents the opposite extreme case of large $\sigma$ and shows a
clear domination of a few sites that have high degree.
Figure 4. For intermediate values of $\sigma$, the networks display clumping on multiple sites, a signature
of power-law behaviour in the degree distribution. The networks shown exhibit power-law degree
distributions with different exponents. The respective $\sigma$ values are (a) 1.5, (b) 1.6, (c) 3 and (d) 4.
Further, the power-law exponent can also be controlled by further changing $\sigma$. This is
seen from Figure 6 where, over the range $\sigma = 1.5$–$3.5$, the power-law exponent varies
from approximately the Barabási value of 3 to 2. Therefore, the scheme we propose has the
added advantage of allowing a measure of tunability in the degree distribution of the
network. As discussed in the following section, this could prove useful for applications in
the context of peer-to-peer networks.
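To quantify how the exponent changes with $\sigma$, one could estimate it directly from a degree sequence. The sketch below uses a standard approximate maximum-likelihood estimator for the tail of a discrete power law; it is a technique we swap in for illustration, not necessarily the procedure used to obtain the exponents quoted above, and the cut-off `k_min` is a free choice of the user. The synthetic data only serve to check the estimator against a known exponent.

```python
import math
import random

def estimate_exponent(degrees, k_min):
    """Approximate maximum-likelihood exponent for the tail k >= k_min."""
    tail = [k for k in degrees if k >= k_min]
    if not tail:
        raise ValueError("no degrees at or above k_min")
    log_sum = sum(math.log(k / (k_min - 0.5)) for k in tail)
    return 1.0 + len(tail) / log_sum

# Synthetic check: integer samples with a known tail exponent of 2.5,
# generated by rounding a continuous power-law variate.
rng = random.Random(0)
true_exponent, k_min = 2.5, 5
sample = [
    math.floor(
        (k_min - 0.5) * (1.0 - rng.random()) ** (-1.0 / (true_exponent - 1.0)) + 0.5
    )
    for _ in range(50_000)
]
print(estimate_exponent(sample, k_min))
```

Applied to the degree sequence of a generated network, the same function gives a point estimate of $\lambda$ for a chosen `k_min`; the result depends on that cut-off and should be read alongside a log–log plot.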
4. Related work and application to P2P networks
The preferential attachment mechanism of Barabási and Albert results in a power-law
degree distribution with $\lambda = 3$. This mechanism therefore accounts for only a small subset of the
scale-free networks in the real world. A different generative model for scale-free networks is the
copy model studied by Kumar et al. [18], in which the new node chooses an existing node
at random and copies a fraction of the links of the existing node. This model is similar to
the notion of ‘triadic closure’, also a key issue in sociology: links are much more likely
to form between two people when they have a friend in common [16]. The copy model
assumes that the new node knows whom the chosen existing node is connected to. This
assumption is reasonable for web graphs, for which the model was originally proposed.
However, the assumption is not always reasonable in other circumstances, as is clear from
the case of sexual contact networks discussed in Section 1.
In order to produce a wide range of power-law exponents, the generative models in
[6,7,32] also suggest the idea of associating each node with an intrinsic fitness, like our
proposed model. The common approach in these efforts is that given any distribution from
which fitness is chosen, the probability to connect a pair of nodes is a function of their
corresponding fitnesses, which can be formulated based on the fitness distribution to
ensure the scale-freeness of the network. In spite of their mathematical beauty, these
models are based on sophisticated formulas which are difficult to interpret naturally or
intuitively. Our model is much simpler as we explicitly propose that (1) fitness is
lognormally distributed and (2) the linking function is based on pure preferential
attachment (attractiveness to new connections is proportional to fitness). Both of these
hypotheses are reasonable and natural.
Figure 5. The seamless evolution from exponential to power-law degree behaviour is seen with
increasing $\sigma$. The fact that the graph is plotted on a log-linear scale makes this transition transparent.
The curves have been shifted vertically for clarity.
[Curves shown for $\sigma$ = 0.1, 0.2, 0.6, 1 and 1.5; axes: count against connectivity.]
Figure 6. Changing power law with increasing $\sigma$. These log–log distributions clearly show a
changing power-law exponent with a value around 3 for $\sigma = 1.5$ and one closer to 2 for $\sigma = 3.5$. The
degree distribution of the Barabási construction, which leads to an exponent of 3, is also shown as a
basis for comparison.
[Curves shown for the Barabási construction and for $\sigma$ = 1.5, 2, 2.5, 3 and 3.5; axes: count against connectivity.]
There has been a large body of work on methods that aim to construct a network that
satisfies some given criteria other than a power-law degree distribution. Notably, as the
small-world phenomenon [33] is observed in many real-world networks and is a property
that makes search in networks more efficient, a number of methods have been proposed to
generate networks that have this property. Starting from an initial (static) network, a
popular approach to make it a small-world network is to follow a rewiring process in
which random long-range links are added to connect a node to a random far-flung node
[8,15]. Compared to this line of work, our work has a different purpose: it aims to
generate power-law networks, not small-world networks.
Our proposed model could be useful for designing efficient P2P search networks. A P2P
network consists of a large number of nodes (i.e. computers) that operate in a decentralised
fashion to enable resource-sharing services or to accomplish global tasks for a community
of users. P2P networks are categorised into two main classes: structured and unstructured.
In a structured P2P network, the nodes are connected to each other according to a
predefined overlay structure which needs to be maintained over time. Examples of such a
structure are the well-known Distributed Hash Tables CAN [24], Chord [31] and Tapestry
[34]. In contrast, an unstructured P2P network is formed such that links between
nodes are created arbitrarily. Although structured P2P networks enable efficient search,
unstructured P2P networks have their own advantages. In the latter networks, since
no structure is required to dictate how nodes are connected, the cost of structure maintenance
is avoided. In addition to being much easier to construct and manage and sometimes
considered more robust to highly dynamic environments, unstructured networks allow the
data and queries to be described in any rich form, unlike structured P2P networks which
usually require a certain structure-dependent format for the data and queries. Most
existing P2P networks are unstructured [30], including popular file-sharing networks such
as Gnutella [17], Limewire [26] and Morpheus (Homepage, http://www.morpheus.com/index.html),
and many P2P streaming networks such as the highly popular PPLive (http://www.pplive.com/).
Despite the popularity of unstructured P2P networks, a major challenge is the
development of systematic protocols for determining the topological characteristics of P2P
networks. The importance of this problem comes from the fact that efficient search
algorithms for a P2P system depend on the network having a power-law structure with
exponent between 2 and 3. For a P2P network of size N with such a power-law topology,
the percolation search algorithm of [29] is capable of finding any content in the network
with probability one in time $O(\log N)$.
Although it is known from empirical studies [25,27] that existing P2P networks have
approximate power-law degree distributions, this structure appears to have arisen in an
ad hoc fashion and, thus, it is important to have client-based protocols that are guaranteed
to produce global P2P networks with a predictable power-law topology. One such protocol
has been proposed in [28]. We believe that our construction of power-law networks with
tunable exponents in the range between 2 and 3 provides an alternative basis for a new
client-based protocol that guarantees the emergence of a global topology for P2P networks
that is suitable for implementing efficient search algorithms, such as that of [29].
The client-level protocol that emerges from our construction of power-law networks is
straightforward in principle. Each node in a P2P network has a locally generated random
variable, the node’s fitness, which is drawn from a lognormal distribution, with a globally
specified value $\sigma$. New nodes entering the network then connect to existing nodes using
the lognormal fitness attachment protocol described earlier. As shown, this protocol will
automatically result in a P2P network with the global topology of a power-law network
and with exponent in the range between 2 and 3. As such, this P2P network will
be tailored for the percolation search algorithm [29].
Table 1. Mean search times for random walk search on three networks of N = 10,000 nodes
generated using lognormal fitness attachment, with $\sigma$ = 2, 3, 4, and the corresponding search times on
three reference random power-law networks with the same number of nodes and degree sequences
constructed using the configuration model [23].
σ    Lognormal    Random
2    17,445       20,393
3    18,476       19,408
4    18,696       21,668
Note: In all cases, the mean search times were calculated using 100,000 randomly selected start–finish pairs.
The results support the conjecture that lognormal fitness attachment generates a network with a topology more
amenable to search than a random power-law network with the same degree distribution.
In addition to constructing power-law networks with an exponent in the range
suitable for efficient search algorithms, we conjecture that the lognormal fitness
attachment protocol naturally constructs a power-law network with a global topology
that allows more efficient search than a random power-law network with the same
degree distribution. This conjecture is supported by the result of random walk search
[1] on networks generated using our protocol and on random power-law networks
with the same degree sequence. Table 1 shows the results for the mean search time
for random walk searches on three networks of N = 10,000 nodes constructed using
the lognormal fitness attachment protocol and for three reference networks with
identical nodes and degree sequences which were constructed using the configuration
model [23]. We note that in all cases the random walk search is more efficient on
the networks generated by our protocol than on random networks with the same
degree sequence. Testing the efficiency of the percolation search algorithm [29] on
networks constructed using our protocol is an interesting question currently under
consideration.
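A mean search time of the kind reported in Table 1 could be measured along the following lines. This is a sketch under our own assumptions: the walk is a simple unbiased random walk, the search time is the number of hops until the target is first reached, and the graph is a connected adjacency dictionary. The precise variant of random walk search in [1] used for Table 1 may differ.

```python
import random

def random_walk_search_time(adjacency, start, target, rng, max_steps=1_000_000):
    """Number of steps a simple random walk takes to go from start to target."""
    node, steps = start, 0
    while node != target:
        if steps >= max_steps:
            raise RuntimeError("target not reached within max_steps")
        node = rng.choice(adjacency[node])   # move to a uniformly chosen neighbour
        steps += 1
    return steps

def mean_search_time(adjacency, pairs, seed=0):
    """Average search time over randomly selected start-finish pairs."""
    rng = random.Random(seed)
    nodes = list(adjacency)
    total = 0
    for _ in range(pairs):
        start, target = rng.sample(nodes, 2)   # distinct start and finish
        total += random_walk_search_time(adjacency, start, target, rng)
    return total / pairs

# Toy example: a ring of 20 nodes (not one of the networks in Table 1).
ring = {i: [(i - 1) % 20, (i + 1) % 20] for i in range(20)}
print(mean_search_time(ring, pairs=200))
```

Running the same measurement on a generated network and on a degree-preserving randomisation of it would give the kind of side-by-side comparison shown in Table 1.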
Our work is somewhat related to the work of Schmid and Wattenhofer in [30] and
similar efforts (e.g. [21]), which also provide a P2P design for efficient search in
unstructured networks. The fundamental difference, however, is that while their method
applies a rewiring algorithm on top of an existing network (by adding new links between
nodes to improve search), we have proposed a generative model to form power-law P2P
networks with intrinsic properties that are conducive to efficient search, without
requiring any rewiring. To our knowledge, this feature of the proposed method is
unique.
5. Conclusion
We have suggested a simple statistical method for generating networks with a wide range
of degree distributions. A single statistical parameter governs the final outcome which can
range from exponential to power law to a monopolistic (winner-takes-all) network.
Further, the method does not utilise any information on the connectivity of the existing
network structure as the network is built. This protocol naturally constructs power-law
networks which are well suited for efficient search. The protocol may also be
useful in changing the character of an already existing network. This and related issues
are currently under consideration.
Acknowledgement
This work was supported in part by the National Science Foundation under award number
CNS-0753066. Any opinions, findings and conclusions or recommendations expressed in this
material are those of the author(s) and do not necessarily reflect those of the National Science
Foundation.
Notes
1. Email: [email protected]
2. Email: [email protected]
3. Email: [email protected]
References
[1] L. Adamic, R. Lukose, A. Puniyani, and B. Huberman, Search in power-law networks,
Phys. Rev. E 64 (2001), p. 046135.
[2] R. Albert, H. Jeong, and A.L. Barabási, Diameter of the world-wide-web, Nature 400 (1999),
pp. 107–110.
[3] L.A.N. Amaral, A. Scala, M. Barthelemy, and H.E. Stanley, Classes of small-world networks,
Proc. Natl Acad. Sci. USA 97 (2000), pp. 11149–11152.
[4] A. Barabási, The physics of the web, Phys. World 14(7) (2001), pp. 33–38.
[5] A.L. Barabási and R. Albert, Emergence of scaling in random networks, Science 286 (1999),
pp. 509–512.
[6] C. Bedogne and G.J. Rodgers, Complex growing networks with intrinsic vertex fitness, Phys.
Rev. E 74 (2006), p. 046115.
[7] G. Caldarelli, A. Capocci, P.D.L. Rios, and M. Munoz, Scale-free networks from varying vertex
intrinsic fitness, Phys. Rev. Lett. 89(25) (2002), p. 258702.
[8] A. Chaintreau, P. Fraigniaud, and E. Lebhar, Networks become navigable as nodes move and
forget, in ICALP ’08: Proceedings of the 35th International Colloquium on Automata,
Languages and Programming, Part I, Reykjavik, Iceland, Springer-Verlag, Berlin, 2008,
pp. 133–144.
[9] P. Erdős and A. Rényi, On random graphs, Publ. Math. 6 (1959), pp. 290–297.
[10] P. Erdős and A. Rényi, On the evolution of random graphs, Publ. Math. Inst. Hungarian Acad.
Sci. 5 (1960), pp. 17–61.
[11] P. Erdős and A. Rényi, On the strength of connectedness of a random graph, Acta Math.
Acad. Sci. Hungar. 12 (1961), pp. 261–267.
[12] M. Faloutsos, P. Faloutsos, and C. Faloutsos, On power-law relationships of the internet
topology, ACM SIGCOMM '99, Comp. Comm. Rev. 29 (1999), pp. 251–260.
[13] H. Jeong, S. Mason, R. Oltvai, and A. Barabási, Lethality and centrality in protein networks,
Nature 411 (2001), p. 41.
[14] H. Jeong, B. Tombor, R. Albert, N. Oltvai, and A. Barabási, The large-scale organization of
metabolic networks, Nature 407 (2000), p. 651.
[15] J. Kleinberg, The small-world phenomenon: An algorithmic perspective, Tech. Rep., Ithaca,
NY, USA (1999).
[16] J. Kleinberg, The convergence of social and technological networks, Commun. ACM 51
(2008), pp. 66–72.
[17] T. Klingberg and R. Manfredi, Gnutella protocol development (2002). Available at
http://rfc-gnutella.sourceforge.net/src/rfc_06-draft.html.
[18] R. Kumar, P. Raghavan, S. Rajagopalan, D. Sivakumar, A. Tomkins, and E. Upfal, Stochastic
models for the web graph, in FOCS ’00: Proceedings of the 41st Annual Symposium on
Foundations of Computer Science, IEEE Computer Society, Washington, DC, 2000, p. 57.
[19] F. Liljeros, C.R. Edling, L.A.N. Amaral, H.E. Stanley, and Y. Aberg, The web of human sexual
contacts, Nature 411 (2001), p. 907.
[20] E. Limpert, W.A. Stahel, and M. Abbt, Log-normal distribution across the sciences: Keys and
clues, BioScience 51(5) (2001), pp. 341–352.
[21] S. Merugu, S. Srinivasan, and E. Zegura, Adding structure to unstructured peer-to-peer
networks: The use of small-world graphs, J. Parallel Distrib. Comput. 65 (2005), pp. 142–153.
[22] M.E.J. Newman, The structure of scientific collaboration networks, Proc. Natl Acad. Sci. USA 98
(2001), pp. 404–409.
[23] M.E.J. Newman, S.H. Strogatz, and D.J. Watts, Random graphs with arbitrary degree
distributions and their applications, Phys. Rev. E 64 (2001), p. 026118.
[24] S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker, A scalable content addressable
network, in ACM SIGCOMM, San Diego, CA, August, 2001, pp. 161–172.
[25] M. Ripeanu, I. Foster, and A. Iamnitchi, Mapping the Gnutella network: Properties of
large-scale peer-to-peer systems and implications for system design, IEEE Inter. Comp. J. 6 (2002),
p. 2002.
[26] C. Rohrs, Limewire design. Available at http://www.limewire.org/project/www/design.html.
[27] S. Saroiu, P.K. Gummadi, and S.D. Gribble, A measurement study of peer-to-peer file sharing
systems, Proceedings of Multimedia Computing and Networking (MMCN) (2002).
[28] N. Sarshar and V. Roychowdhury, Scale-free and stable structures in complex ad hoc networks,
Phys. Rev. E 69(2) (2004), p. 026101.
[29] N. Sarshar, P.O. Boykin, and V.P. Roychowdhury, Percolation search in power law networks:
Making unstructured peer-to-peer networks scalable, Proceedings of the 4th International
Conference on Peer-to-Peer Computing (2004), pp. 2–9.
[30] S. Schmid and R. Wattenhofer, Structuring unstructured peer-to-peer networks, in 14th Annual
IEEE International Conference on High Performance Computing (HiPC), Goa, India, LNCS
4873, Springer, Berlin, December, 2007.
[31] I. Stoica, R. Morris, D. Karger, M.F. Kaashoek, and H. Balakrishnan, Chord: A scalable
peer-to-peer lookup protocol for internet applications, in ACM SIGCOMM, San Diego, CA, August,
2001, pp. 149–160.
[32] D. Vito, G. Caldarelli, and P. Butta, Vertex intrinsic fitness: How to produce arbitrary
scale-free networks, Phys. Rev. E 70 (2004), p. 056126.
[33] D. Watts and S. Strogatz, Collective dynamics of ‘small-world’ networks, Nature 393 (1998),
pp. 440–442.
[34] B.Y. Zhao, L. Huang, J. Stribling, S.C. Rhea, A.D. Joseph, and J. Kubiatowicz, Tapestry:
A resilient global-scale overlay for service deployment, IEEE J. Selected Areas Commun.
22(1) (2004), pp. 41–53.