QoS Oriented Dynamic Replica Cost Model for P2P Computing*
Feng Mao, Hai Jin, Deqin Zhou, Baoli Chen, Li Qi
Cluster and Grid Computing Lab
Huazhong University of Science and Technology, Wuhan, 430074, China
{Hjin, fmao}@hust.edu.cn
Abstract
Replication on multiple nodes is an effective way to
improve availability in a P2P or grid environment.
It is difficult to determine how many replicas can
fulfill the user request for availability QoS, because
peers and networks are undependable and the user
access mode is uncertain.
We emphasize availability QoS in the sense that the
replica number should adapt both to fluctuating user
demands on service capability and to the
undependability of the system environment.
Aiming to supply service to users with availability QoS
at a low replica cost, we introduce a minimum replica
cost model to predict the optimal replica number and
to dynamically control the replica number and the cost.
To mask the undependability of our P2P system, we
also use replication as a redundancy mechanism.
A simulation of our model illustrates the prediction
of the optimal replica number, and we compare the
cost and availability in our system to those in others
under different cost strategies. The result shows that
our system supplies a better availability QoS at an
optimal cost through its adaptability to both system
undependability and user access mode. The cost
strategies in our model can be flexibly reconfigured
according to the replica service providers in our
system.
1. Introduction
This paper emphasizes the number of replicas
needed in a P2P system to fulfill the user’s request. A
P2P system is characterized by applications that
employ distributed resources to perform functions in a
decentralized manner. From the resource sharing
viewpoint, a P2P system can overlap with a grid system
[2][4]. The abundant resources in a P2P environment
can serve as a redundancy mechanism to improve
replica availability.
Many studies on availability QoS in P2P or grid
systems emphasize system dependability, which is
just one part of the service QoS offered to users, and
neglect the volume of service capability needed to meet
the users’ requests. Therefore, this paper aims to
construct a high-availability multiple-replica system
based on P2P technology, concerned with both the
variation of the user access mode and the
undependability of the P2P system.
The more replica resources are assigned to certain
data or a service, the easier it is for our system to
guarantee the QoS of availability and to decrease the
cost of user complaints about unavailable replica
service. However, increasing the replica number
increases system overhead such as replica storage and
management. The redundancy scale, that is, the number
of replicas, therefore needs to be controlled.
A replica QoS cost prediction model is set up to
balance the merits of availability QoS against the cost
of replica overhead. We put the number of replicas at
the balance point according to the cost strategies made
by the replica service provider.
Furthermore, to mask undependability in the P2P
system, we add a few more replicas according to a
Poisson distribution model, after predicting a proper
number of replicas to achieve availability QoS through
a random queue and optimal expectancy model.
We formalize our question as one target function,
f(|R|, R, Cost, Availability), as described in [9]. |R| is
a scalar, the number of replicas; R is a vector that
gives the placement of the replicas; Cost covers the
replica overhead and user complaints; Availability is
the ability of the system to fulfill the users’ demands,
including its capability to adapt both to the fluctuating
user access mode and to the undependable P2P
environment.
The rest of the paper is organized as follows. We
introduce some background on P2P and replication
topics in section 2. In section 3, we briefly describe the
structure and topology our system uses to organize and
locate resources. Section 4 focuses on the setup of our
multi-replica QoS cost model to predict the best
number (|R|) of replicas. In section 5, we discuss the
simulation study and comparison. In section 6, we
discuss future work on our system. Section 7 draws
the conclusion.
* This work is supported by National Nature Science Foundation under grant 60273076.
2. Background and related work
Decentralized control provides good system
scalability. The abundance of available redundant
resources, at the same time, helps to eliminate single
points of failure in the system and provides higher
availability. Direct communication between peers and
dynamic resource assignment optimize the performance
of the whole system. Despite these benefits, the
dynamic system status and essentially decentralized
control that characterize a P2P system make it a great
challenge to put the system to good use and to
introduce it into commercial applications. A tremendous
number of nodes unpredictably joining and leaving the
system makes it difficult to obtain an accurate global
status. All of these fundamentally hinder the
construction of an easily used, dependable and
high-performance P2P system or application.
Although research on basic QoS properties such as
delay, bandwidth and dependability has achieved a
great deal, it focuses only on improving system
dependability through redundancy mechanisms [3],
addressing, e.g., the failure of peers and networks, the
unpredicted joining and leaving of peers, and the
inaccuracy of resource location. To offer service
availability QoS in the real world, a successful system
should also adjust its service capability dynamically to
meet user demands.
Replication is often used to improve performance [4]
and dependability [6] in distributed systems.
Replication management strategies, such as replica
placement, transfer, creation, locating, and selection,
are seldom based on the P2P mode. We adopt
replication technology to increase the service
availability in our P2P prototype.
A replication model for a large P2P system is
provided in [5] to improve availability in an
undependable environment. Each peer on its own
computes the current availability of certain data or
service content and decides the number of its replicas.
The model is shown to be an effective way to guarantee
the availability QoS for certain data or a service, but it
only masks the system undependability. It ensures only
that a single replica is available, which cannot satisfy
the user demands on replica service capability.
The replication strategies in [9] describe the effect of
the number and placement of replicas in a random
topology on the availability QoS of widely distributed
systems, and relate the replicas to system
undependability parameters such as peer or network
failure and the accuracy of hits or locating. The
topology, however, is far from practice.
In [8], the availability QoS is decoupled into the
request availability QoS and the supply availability
QoS. Compared to [9], the researchers begin to
emphasize that the availability QoS is affected by both
requester and supplier, but they have no strategies to
make the availability QoS follow the user access mode
and demands.
In summary, this paper considers both the system
undependability (peer failure/recovery, join/quit)
described in [5] [8] [9] and the system capability to
provide service on user demand in our P2P system.
3. Architecture and resources management
3.1. Resource organization and locating
In general, resource organization and management
in P2P or distributed systems can be classified as
centralized, decentralized and unstructured, or
decentralized but structured [1]. Accordingly, the main
methods used to locate and manage resources divide
into the dedicated locating system, in which some
nodes are dedicated to locating, and the “pure” P2P
system, in which each peer has a locating function
besides its roles as client and server [6].
Our P2P system uses a hybrid topology, shown in
Figure 1, to reduce the cost of managing and locating
replica resources. The roles of a peer are extended to
client, server, locator and manager, which are assigned
to peers in different contexts. Peers in the top level
have the functions of locator and manager and are
called super-peers.
Peers in our prototype system are divided in two
dimensions. Horizontally, peers are separated into
domains according to organizational strategies such as
the physical geographic distribution of peers or a
logical VO (a high-level strategy) [4].
In each domain, peers can directly locate others in
the same domain by a hash index table and broadcasts,
so that frequent and fragmented exchanges of local
locating information stay local. Hierarchically, a
limited number of peers from each domain are selected
as super-peers to make up the upper level of a tree;
they are responsible for exchanging locating
information with other domains and for managing the
replicas of certain data/service content.
Figure 1. topology structure and dimension
divisions
This way of organizing and managing resources
decreases the total system overhead and reduces the
communication scale for locating in the whole system.
Normally the number of levels of the tree structure is
not more than 3, the average depth of DNS. Our system
uses Bloom filter technology [5], and the levels of the
topology can be decreased to 2.
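As an illustration of how a super-peer could summarize the content held in its domain, the following minimal Python sketch implements a Bloom filter. It assumes the "bloom technology" of [5] refers to Bloom filters; the class name `BloomFilter`, its parameters and the SHA-256 based hashing are hypothetical choices, not details taken from the prototype.

```python
import hashlib


class BloomFilter:
    """Compact set summary: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        # Sizes are illustrative assumptions, not values from the paper.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


# A super-peer advertises the filter instead of its full replica index.
domain_summary = BloomFilter()
domain_summary.add("content-42")
print(domain_summary.might_contain("content-42"))
```

Exchanging such summaries lets a super-peer answer "which domain may hold this content" without forwarding the query through a further tree level, at the price of occasional false positives.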
3.2. Replica management and placement
A fully distributed decision on the number of
replicas of certain content [3] does not suit our system,
because that method is not aware of the global user and
system status. The super-peers are responsible for the
monitoring, prediction and decision work for certain
content or a service, exchanging global information
among themselves.
Although management of the replicas of one specific
content is centralized in one peer, the management of
different kinds of replicas is distributed over different
super-peers in different domains, so the replica
management load is decentralized. Replica management
includes: 1) gathering the current system and access
mode parameters; 2) computing the predicted number of
replicas and the cost for certain content; 3) controlling
replica creation and movement; 4) replica registration;
5) replica life cycle management.
To keep the number of replicas from oscillating
violently with the varying user access mode, which
would cause replica re-creation and bandwidth
consumption, we manage a dynamic replica life cycle,
similar to grid service life cycle management, by
marking replicas as “inactive” or “active”.
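The life cycle idea can be sketched as follows: surplus replicas are parked as inactive rather than deleted, and parked replicas are reactivated before any new one is created. This is a minimal Python sketch under that assumption; `ReplicaPool`, `resize` and the replica naming are hypothetical, and the expiry of long-inactive replicas is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicaPool:
    """Tracks replicas of one content item as active or inactive."""
    active: list = field(default_factory=list)
    inactive: list = field(default_factory=list)
    created: int = 0  # number of (expensive) replica creations so far

    def resize(self, target):
        """Bring the number of active replicas to `target`."""
        if target < 0:
            raise ValueError("target must be non-negative")
        # Demand dropped: mark surplus replicas inactive, do not delete.
        while len(self.active) > target:
            self.inactive.append(self.active.pop())
        # Demand rose: reactivate parked replicas before creating new ones.
        while len(self.active) < target:
            if self.inactive:
                self.active.append(self.inactive.pop())
            else:
                self.created += 1
                self.active.append(f"replica-{self.created}")


pool = ReplicaPool()
for demand in (3, 1, 3):  # demand oscillates between time slots
    pool.resize(demand)
print(pool.created)  # only three creations despite the oscillation
```

The design choice is to trade some storage for parked replicas against the transfer bandwidth that re-creation would consume.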
Replica placement is classified as an NP-hard
discrete location problem [7]. Random placement is a
typical non-heuristic algorithm [9]. We adopt this
easily implemented algorithm to place the replicas on
the available peers.
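Random placement can be sketched in a few lines of Python. The sketch assumes that each peer holds at most one replica of a given content; the function name `place_replicas` and the peer identifiers are hypothetical.

```python
import random


def place_replicas(available_peers, replica_count, rng=None):
    """Pick `replica_count` distinct peers uniformly at random."""
    peers = list(available_peers)
    if replica_count > len(peers):
        raise ValueError("not enough available peers for one replica each")
    rng = rng or random.Random()
    # sample() draws without replacement, so no peer is chosen twice.
    return rng.sample(peers, replica_count)


# Example: one domain of 50 peers, 6 replicas to place.
domain = [f"peer-{i}" for i in range(50)]
chosen = place_replicas(domain, 6, random.Random(0))
print(len(chosen), len(set(chosen)))
```

Passing a seeded `random.Random` makes a simulation run reproducible; the placement itself ignores network distance and load.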
4. Availability definition and cost model
4.1. Definition of availability and problem solution
We define the QoS of availability as the capability
of the target system to fulfill the user demand on
service capability and service dependability. We
present a replica prediction model from the viewpoint
of both internal system dependability and external user
demand on service capability.
To make the problem easier to analyze, we divide our
solution of the replica number problem into three
components, corresponding to three steps: the queue
model, the optimization expectancy model, and the
system dependability model. The three components are
shown in the columns of table 1, with their key
parameters and relations in the model.
First, we use the M/M/n/n model to convert the user
access strength, which can be easily obtained from the
system log, into the random number of replicas
requested by users. Then a replica prediction model is
set up to balance the availability QoS merits of
replicas against their costs in the long term. By
introducing a probability analysis method, we find how
many replicas are suitable to supply enough service
capability to users at the least cost expectancy. Lastly,
because of the undependability of the P2P environment,
we use a few more replicas as a redundancy mechanism
to mask system failures and guarantee the service
availability of the replicas. After these steps, we
obtain the proper replica number in our P2P
environment to guarantee QoS of availability according
to the history log.

Table 1. Replica and cost model components and main parameters

Column 1. Component: External demand
  Analysis model: M/M/n/n queue model
  Distribution function: Stable distribution
  Main parameters:
    n: number of replicas/servers
    1/µ: average service time
    λ: access strength
    PL: loss ratio of requests; it represents the whole system availability
  Parameters:
    n: the least number of replicas to achieve PL

Column 2. Component: |R| prediction
  Analysis model: Statistics and extremum
  Distribution function: P(n): distribution of the replica demand number
  Main parameters:
    Cr: basic cost of a replica for storage and management
    C: the cost (penalty) for fewer replicas than the user demand
    C1: the cost for more replicas than the real user demand
    n: theoretically predicted replica number on user demand
  Parameters:
    n0: the best predicted replica number at the lowest cost

Column 3. Component: P2P internal environment
  Analysis model: Birth and death process and statistics
  Distribution function: Binomial distribution
  Main parameters:
    P_dependability: the dependability of m undependable replicas
    P_stability: peer stability
    LC: the ability of our system to locate correctly
    m: the replica number to support a certain PL
  Parameters:
    m: the replica number with redundancy to guarantee n0 and PL
Before going into the details, we state a few
assumptions and simplifications, which help to grasp
the critical problems without affecting the validity of
the model. 1) Each replica (data/service) can serve only
one request or access at a time. In a more complex
circumstance, where one replica can supply more
service volume for user requests at a time, a volume
factor can be attached to the replica number in our
model. 2) The user access mode can be acquired from
the statistics of the history trace of a certain
application. Despite the randomness of the access
frequency, its distribution, that is, the access mode,
exists. 3) The dependability of the elements in a P2P
system, such as networks and nodes, can be obtained by
statistics. 4) The accuracy of a certain resource
locating mechanism can be obtained by algorithm
analysis or simulation.
4.2. Queue model: assess replica number
requests by user access strength
In our system, the distribution of user requests for
certain replica content can be regarded as a Poisson
distribution, because the requests from different users
are independent. The service time on each replica
(data/service) varies from one user to another and is
decided by the specific application. Thus our system
can be modeled by M/M/n/n theory. The parameters we
use are listed in the first column of table 1.
$$P_L = P(n) = \frac{(\lambda/\mu)^{n}}{n!} \left[ \sum_{k=0}^{n} \frac{(\lambda/\mu)^{k}}{k!} \right]^{-1} \qquad (1)$$
The value of µ depends on the application: not only
on the type of service (web, ftp or web service) but
also on the content of the service (file or procedure).
To control the total replica cost while guaranteeing the
QoS of availability, the least replica number n is
predicted. We calculate n(t), the replica number,
through the relation between λ(t) and n(t) given by
equation (1).
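The loss probability of equation (1) and the least replica number for a target loss ratio can be computed as in the following Python sketch. It uses the standard numerically stable recurrence for the M/M/n/n loss formula instead of evaluating factorials directly; the function names and the example values are hypothetical.

```python
def erlang_b(offered_load, n):
    """Loss probability PL of equation (1) for load a = lambda/mu and n replicas."""
    b = 1.0  # with zero replicas every request is lost
    for k in range(1, n + 1):
        # B(a, k) = a*B(a, k-1) / (k + a*B(a, k-1))
        b = offered_load * b / (k + offered_load * b)
    return b


def min_replicas(lam, mu, pl_target):
    """Smallest n whose loss probability does not exceed pl_target."""
    if not 0.0 < pl_target <= 1.0:
        raise ValueError("pl_target must be in (0, 1]")
    if lam < 0 or mu <= 0:
        raise ValueError("lam must be >= 0 and mu must be > 0")
    load = lam / mu
    n = 0
    while erlang_b(load, n) > pl_target:
        n += 1
    return n


# Example with assumed values: 2 requests per unit time, mean service
# time of 1 unit, and at most 10% of requests lost.
print(min_replicas(lam=2.0, mu=1.0, pl_target=0.1))
```

Applying `min_replicas` to each hourly value of λ(t) from the access log yields the series n(t) used in the next step.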
4.3. Replica number prediction and cost model
In this section we introduce the cost model and
parameters of the second component, which are listed
in the second column of table 1. The random variable
of replica number, n(t), converted from the history
access log in the last section, is used to predict the
behavior of users and to calculate the expected replica
number, n0(t), in a specific hour of the day, which our
system should supply to users to guarantee QoS of
availability.
The cost of a replica has different meanings in
different contexts, including storage cost, transfer cost
and replica management cost. We regard the cost of
user complaints about unavailable replica service and
poor QoS of availability as customer cost.
Abstracting from these concrete costs, we introduce a
cost vector [C, C1, Cr] on the application level from
the view of service providers: “C” is the cost caused by
the replica number not fulfilling the real user demand;
“C1” is the cost of predicting and providing more
replicas than the number that users really demand;
“Cr” is the basic cost of replica management and
maintenance. When fewer replicas than the real user
demand are predicted and provided by our system, the
cost mainly consists of customer or user complaints
about the unavailable service; when more replicas than
the real user demand are predicted and provided, the
cost mainly comes from additional replica management
overhead and resource consumption. In this context,
“C” and “C1” are penalties for the deviation of the
predicted replica number from the real replica number
that users demand.
TC(n0): the cost expectancy when the system supplies
n0 replicas.
Cost(n0): the cost function when the system supplies
a replica number of n0.
P(r): the distribution function of the replica request
number, which can be obtained from the history log.
The number of replicas, n(t), should coordinate with
λ(t) in equation (1). When replicas supply service, they
incur costs from the vector [C, C1, Cr]. We assume that
the predicted replica number, n0(t), is the one with the
least cost, that is, the one that best matches user
demand.

$$Cost(n_0) = \begin{cases} (n_0 - n)\, C_1 + n_0\, C_r, & n < n_0 \text{ at time } t \\ (n - n_0)\, C + n_0\, C_r, & n \ge n_0 \text{ at time } t \end{cases}$$

Summing over the possible values r of the demanded
replica number n, the cost expectancy is

$$TC(n_0) = \sum_{r=0}^{n_0} \left[ (n_0 - r)\, C_1 + n_0\, C_r \right] p(r) + \sum_{r=n_0+1}^{\infty} \left[ (r - n_0)\, C + n_0\, C_r \right] p(r)$$

or, treating the demand as a continuous variable,

$$TC(n_0) = \int_{0}^{n_0} \left[ (n_0 - n)\, C_1 + n_0\, C_r \right] p(n)\, dn + \int_{n_0}^{\infty} \left[ (n - n_0)\, C + n_0\, C_r \right] p(n)\, dn$$

When dTC/dn0 = 0, TC is the least, and we get

$$\int_{0}^{n_0} p(n)\, dn = \frac{C - C_r}{C_1 + C} \qquad (2)$$

From (2), we can calculate and predict the best
expected replica number, n0(t), for a specific
application.
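Equation (2) says that n0 sits at the quantile (C - Cr)/(C1 + C) of the demand distribution. The following Python sketch applies this to an empirical distribution built from a demand log, taking the smallest n0 whose cumulative probability reaches that quantile, and includes the expected cost so the choice can be checked by brute force. The function names and the sample log are hypothetical; exact fractions are used to avoid rounding at the threshold.

```python
from collections import Counter
from fractions import Fraction


def demand_pmf(demand_log):
    """Empirical distribution p(r) of the demanded replica number."""
    if not demand_log:
        raise ValueError("demand_log must not be empty")
    counts = Counter(demand_log)
    return {r: Fraction(k, len(demand_log)) for r, k in counts.items()}


def expected_cost(n0, pmf, c, c1, cr):
    """TC(n0): C1 penalises surplus replicas, C penalises missing ones."""
    total = Fraction(0)
    for r, p in pmf.items():
        penalty = (n0 - r) * c1 if r < n0 else (r - n0) * c
        total += (penalty + n0 * cr) * p
    return total


def optimal_replicas(demand_log, c, c1, cr):
    """Smallest n0 whose cumulative probability reaches (C - Cr)/(C1 + C)."""
    pmf = demand_pmf(demand_log)
    target = Fraction(c - cr) / Fraction(c + c1)
    cdf = Fraction(0)
    for r in range(max(pmf) + 1):
        cdf += pmf.get(r, Fraction(0))
        if cdf >= target:
            return r
    return max(pmf)


# Hypothetical hourly demand log n(t) and strategy C:C1:Cr = 2:1:0.
log = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6]
print(optimal_replicas(log, c=2, c1=1, cr=0))
```

A larger C relative to C1 moves the target quantile up, so the sketch provisions more replicas; a nonzero Cr moves it down.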
4.4. P2P system dependability and stability
Because of the unpredictable resources sharing
mode that peers frequently join and quit in system, we
introduce modle and parameters listed in the third
column in table 1 to guarantee system dependability
mechanics. Our system provides little more replicas
(“m”) than the user demand number (“n”) in order to
mask the instability and undependability and guarantee
at least “n” replicas available to users at same time.
P_peer = P_stability * P_dependability * P_link * LC

$$P(n) = Av = \sum_{i=0}^{m-n} C_{m}^{n+i} \left( P_{peer} \right)^{n+i} \left( 1 - P_{peer} \right)^{m-n-i} \qquad (3)$$

Because the peers are independent of one another,
the binomial distribution model describes the relation
between “m0(t)” and “n0(t)”. Equation (3), in which
C_m^{n+i} is the binomial coefficient, reflects the
relation between dependability and replica number in
our P2P system. Generally, if Av > 0.9, the
dependability of the replica number that users demand
is guaranteed.
4.5. Put the three models together
We can calculate the proper number of replicas to
guarantee availability QoS at a low cost in our P2P
system by combining equations (1), (2) and (3) into the
target optimization function f(P(r), λ(t), C, TC, m(t),
PL).
Optimization function: f(P(r), λ(t), C, TC, PL0, Av0), built from (1), (2), (3)   (4)
Prior condition/requirement: min(TC), PL < PL0, Av > Av0
Conclusion: m0(t) = |R|
5. Simulation study for our P2P prototype
5.1. Simulation environments and parameter
settings
We simulate the P2P environment with three
domains, each of which contains 50 peers, and we
constrain each peer to hold only one replica of a
specific content or service. The number of peers is
sufficient for our random access trace.
All the requests that come to a specific replica
content (or service) and all the replicas are centrally
managed by a super-peer randomly selected in the top
level of our system.
Figure 1 presents the structure. The super-peers
dedicated to managing replicas are considered reliable,
while the ordinary peers are undependable. The
availability of each replica on a single peer is set to
0.8, which accounts for peer quit and failure, network
failure and locating failure.
The prediction model is based on the predictability
of the user access mode. The access mode exists and is
exhibited by λ(t) and n(t) in figure 2 and figure 3.
Figure 2, the user access intensity log over 24 hours
during four months, exhibits the pattern of λ(t).
Figure 2. User requests intensity in 24 hours
during 4 months
Figure 3. Average request number of
replicas, n (t), in each of the 24 hours, in 4
different weeks and their total average value
In the simulation, we describe the effect of the
parameter settings on the number of replicas, the
average replica cost and the QoS of availability. The
parameter settings in the vector C reflect the cost
strategies and preferences of the service provider
between system cost and QoS. We set different groups
of [C, C1, Cr] parameter configurations, as described in
section 4. In each group, the ratio between C and C1
represents the emphasis on system replica cost or on
service satisfaction of users (QoS). The weight of Cr
indicates whether we care about the basic replica cost
when plenty of resources are available in the P2P
environment.
According to the user access intensity, we use the
middle value as the service rate of a replica, µ. To
mask the unreliability of the P2P environment, we use
a little more redundancy to guarantee that a certain
number of replicas work properly. We set Av = 0.9 to
guarantee that the availability of the “n” replicas is
more than 0.9. The parameter PL is measured as the
QoS of availability from the viewpoint of users. We
take the values of Av and PL as the system internal
availability and the overall QoS of availability,
respectively.
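With the per-replica availability of 0.8 and the threshold Av = 0.9 used in this simulation, the redundancy step of equation (3) can be sketched in Python as below. The sketch assumes independent peers and reads equation (3) as the probability that at least n of m replicas are available; the function names are hypothetical.

```python
from math import comb


def availability(m, n, p_peer):
    """Probability that at least n of m independent replicas are available."""
    return sum(comb(m, k) * p_peer ** k * (1 - p_peer) ** (m - k)
               for k in range(n, m + 1))


def redundant_replicas(n, p_peer, av_target):
    """Smallest m >= n with availability(m, n, p_peer) > av_target."""
    if not 0.0 < p_peer <= 1.0:
        raise ValueError("p_peer must be in (0, 1]")
    if not 0.0 <= av_target < 1.0:
        raise ValueError("av_target must be in [0, 1)")
    m = n
    while availability(m, n, p_peer) <= av_target:
        m += 1
    return m


# Four demanded replicas, per-replica availability 0.8, Av above 0.9.
print(redundant_replicas(n=4, p_peer=0.8, av_target=0.9))
```

For these values the sketch adds two replicas on top of the four demanded, since five replicas give an availability of about 0.74 and six give about 0.90112.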
5.2. Notations in figures
In the following figures, the line labeled “n”
represents the real replica number demanded by users.
A line labeled “n” followed by three digits represents
the replica number predicted by our system, and the
digits denote the cost strategy. For example, “n110”
means that our system predicts the number of replicas
with the parameter setting C:C1:Cr = 1:1:0 in our cost
model. Accordingly, a line labeled “m” followed by
three digits gives the replica number with additional
redundancy to mask undependability and guarantee
availability.
5.3. Prediction number of replicas under
different cost parameters
Figure 4. Average number of replicas requested in
a time slot, average predicted number of replicas
under different cost strategies, and replica number
without prediction
When neglecting the replica creation cost and the
basic replica cost and emphasizing the penalty cost
resulting from the deviation of the predicted replica
number from what users really demand, we set Cr = 0
in our model. In figure 4, with Cr = 0, we compare, for
every time slot, the average number of replicas
requested by users, the number of replicas predicted by
our system under different cost strategies, and the
number without prediction [3]. From figure 4, we can
see that our prediction mechanism must spend a little
time studying the log before it reaches a stable state in
which it matches the user demands following the
service provider strategies.
Figure 5. Average replica numbers in a time
slot as the weight of “Cr” increases
The ratio setting (C:C1 = 1) makes the system try
its best to follow the real user demand on the replica
number to achieve the lowest cost; the ratio setting
(C:C1 > 1) indicates that our system would rather
predict and provide more replicas than users really
need, because it cares much more about user
satisfaction with service availability; the ratio setting
(C:C1 < 1), on the contrary, indicates that our system
tries its best to decrease the penalty of wasting
resources on generating and managing unused replicas.
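The effect of the ratio follows directly from equation (2): the predicted number n0 sits at the quantile (C - Cr)/(C1 + C) of the demand distribution. The short Python sketch below evaluates that quantile for three illustrative strategies with Cr = 0; the concrete weights 2:1 and 1:2 are assumed examples of C:C1 > 1 and C:C1 < 1, not settings reported in the simulation.

```python
from fractions import Fraction


def target_quantile(c, c1, cr=0):
    """Right-hand side of equation (2) for integer cost weights."""
    return Fraction(c - cr, c + c1)


strategies = {
    "C:C1 = 1 (1:1:0)": (1, 1),
    "C:C1 > 1 (2:1:0)": (2, 1),
    "C:C1 < 1 (1:2:0)": (1, 2),
}
for label, (c, c1) in strategies.items():
    print(label, target_quantile(c, c1))
```

With equal weights the prediction tracks the median demand (quantile 1/2); a heavier complaint penalty C raises the quantile to 2/3, so more replicas are provided, while a heavier waste penalty C1 lowers it to 1/3.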
The result in figure 4 shows that the number of
replicas under any of the cost strategies is consistent
with the strategy of the service/data provider, and that
the replica number is nearer to the real user demand
than what the system supplies without a prediction
mechanism.
When the basic replica cost cannot be neglected,
figure 5 compares the predicted average number of
replicas in the stable state for different values of “Cr”.
The upper sub-figure presents the predicted replica
number without redundancy for fault tolerance; the
lower one presents the predicted replica number with
fault tolerance. The line labels have the same meanings
as in Figure 4. The average replica number declines to
control the total system cost when the basic replica
cost cannot be neglected, but it still stays close to the
real replica number that users need.
5.4. The system cost under different cost
strategies
Figure 6 presents the average cost under different
replica strategies. The three numbers and their ratios
represent different cost strategies of the service
provider in our system. The upper sub-figure presents
the average system cost in every time slot when there
are replicas for fault tolerance in our P2P system; the
lower one shows the system cost without additional
replicas as redundancy for fault tolerance.
Figure 6. System cost comparison in every
time slot
In each sub-figure of figure 6, we compare the
average total cost under our cost model to that without
a prediction mechanism during every time slot. The
declining trend of the cost indicates that our cost model
and prediction mechanism can decrease the total cost
by controlling the number of replicas. Under the same
cost strategy, the cost in our P2P system, including
system overheads and user complaints, is lower than
that without prediction.
5.5. The availability of replicas in our
prediction
The QoS of availability in our system is better than
that in a system focusing only on system
dependability. Figure 7 shows the QoS of availability
under different cost configurations and compares it
with that of the system without a prediction mechanism
[3]. Because our system and cost model can control the
number of replicas to fulfill the requests from users
and also have a fault tolerance mechanism, the total
QoS of availability is far better than that of systems
which only address the undependability of the P2P
system.
The upper sub-figure of figure 7 compares the
availability of one content or service from the
viewpoint of system dependability. When our system
prefers to decrease system replica cost, the peer
dependability is not as good as in the system dedicated
to system dependability [3]. But under the definition of
availability QoS in our paper, the availability for user
requests in our system is far better than that of the
system which just emphasizes system dependability.
The result is shown in the lower sub-figure of figure 7.
Figure 7. Replica availability comparisons in
every time slot
6. Future work
Our research combines the prediction of the replica
number with system dependability to guarantee replica
availability QoS for the users of a P2P system.
However, the statements and results given above hold
only for the static state of the stable distribution and do
not exhibit the dynamic state transitions in the complex
and dynamic P2P environment. We will further our
research on this problem.
Our system is based on the random placement
algorithm [7][8], and it is noted that random placement
of replicas will degrade the availability QoS. Our
future work will adopt heuristic strategies to place
replicas so as to decrease network cost.
7. Conclusions
In this paper, we have discussed multi-replication
problems in the P2P environment and focused on
deciding the number of replicas to balance the cost of
service providers against the availability QoS for
users. Compared with other research on replication,
which pays more attention to increasing system
dependability, we redefine availability QoS to refer not
only to system dependability but also to the capability
to meet user requests in the P2P environment. We
formulate replica prediction and fault tolerance
capability from the perspective of the service providers
under an optimal cost model. We construct a dynamic
multi-replica P2P system that provides availability QoS
to users at a relatively low cost, with flexible cost
strategies that are easily reconfigured.
We conclude that multi-replication in our P2P
system, with its capability of access mode prediction
and service fault tolerance, can supply better
availability QoS to users than systems that only focus
on system dependability.
8. References
[1] Q. Lv, P. Cao, E. Cohen, K. Li, and S. Shenker, “Search
and replication in unstructured peer-to-peer networks”, Proc.
of the 16th Annual ACM International Conf. on
Supercomputing (ICS’02), New York, June 2002.
[2] D. Milojicic, et al., “Peer to Peer technology”, HP Labs
Technical Report, HPL-2002-57,
http://www.hpl.hp.com/tehcreports/2002/HPL-2002-57.html, 2002.
[3] K. Ranganathan, A. Iamnitchi, I. Foster, “Improving
Data Availability through Dynamic Model-Driven
Replication in Large P2P Communities”, Proc. of the 2nd
IEEE/ACM International Symposium on Cluster Computing
and Grid (CCGRID’02), 2002.
[4] K. Ranganathan and I. Foster, “Design and Evaluation of
Dynamic Replication Strategies for a High-Performance
Data Grid”, Proc. of the International Grid Computing
Workshop, Denver, 2001, pp. 75-86.
[5] M. Ripeanu, I. Foster, “A Decentralized, Adaptive Replica
Location Mechanism”, Proc. of 11th IEEE International
Symposium on High Performance Distributed Computing
(HPDC-11), July 24-26, 2002, Edinburgh, Scotland.
[6] S. Ratnasamy, et al., “A scalable content-addressable
network”, Proc. ACM SIGCOMM 2001, Pittsburgh, PA,
USA, August 27-31, 2001.
[7] M. Karlsson and C. Karamanolis, “Bounds on the
Replication Cost for QoS”, Technical report, Hewlett
Packard Labs, July 2003.
[8] G. On, J. Schmitt, R. Steinmetz, “Quality of Availability:
Replica Placement for Widely Distributed Systems”, IWQoS
2003, 11th International Workshop, Berkeley, CA, USA,
June 2-4, 2003, Proceedings, Lecture Notes in Computer
Science 2707, Springer, 2003.
[9] G. On, J. Schmitt, R. Steinmetz, “QoS-Controlled
Dynamic Replication in P2P Systems”, Proc. of Third
International Conference on Peer-to-Peer Computing, 2003,
Linköping, Sweden.