Effective Replica Allocation
in Ad Hoc Networks for
Improving Data Accessibility
Takahiro Hara
(Proc. IEEE Infocom 2001, pp. 1568-1576)
Presented by Mingsheng Peng
CS401 presentation
Contents
• Why Data Replication
• Related Work
• System Model
• Replica Allocation Methods
• Simulation Model
• Conclusion
Why Data Replication?
• Ad hoc networks are constructed temporarily from mobile hosts only.
• Each mobile host also acts as a router: when source and destination are not in radio range of each other, data packets are forwarded by relaying through other hosts.
• Because hosts move freely, disconnections occur frequently, which causes frequent network division.
Why Data Replication? (contd.)
• Example: if a link goes down, the network is split.
• Nodes on the right cannot access D2.
• Nodes on the left cannot access D1.
Why Data Replication? (contd.)
• A possible solution is to replicate data items at mobile hosts that are not the owners of the original data.
Related Work
• Ad hoc network routing protocols (e.g., DSDV, AODV, DSR, ZRP, CBRP)
  - can only improve connectivity among mobile hosts (MHs) that are connected to each other,
  - but cannot do anything when the network is divided, as in the case in Figure 1.
• Distributed database systems
  - data replication helps reduce response time
  - since failures occur infrequently, a small number of replicas is sufficient
• Mobile computing
  - mobile hosts access databases at sites in a fixed network and create replicas on the mobile hosts
  - addresses the issue of maintaining consistency with low communication cost
  - assumes only one-hop wireless communication
System Model
• The system environment is assumed to be an ad hoc network in which:
  - mobile hosts access data items held by other mobile hosts (over one or multiple hops)
  - each mobile host creates replicas of data items and keeps them in its memory
  - a data item is accessible if the requesting host holds it locally or if a host connected to it (over one or multiple hops) holds it
• Assumptions:
  - unique host identifier: Mj (set of all mobile hosts M = {M1, M2, …, Mm})
  - unique data identifier: Dj (set of all data items D = {D1, D2, …, Dn})
  - all data items are the same size
  - each host has memory space for C data items as replicas (excluding the space for holding originals)
  - data items are never updated (simplifying assumption)
  - the access frequency of each mobile host to each data item is known and does not change
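The accessibility rule of the system model can be sketched as a reachability check. This is only an illustration: the function names, the adjacency-dict representation of links, and the four-host example are assumptions, not taken from the paper. It treats "connected" as reachable over one or multiple hops.

```python
from collections import deque

def reachable_hosts(start, links):
    """Return the set of hosts connected to `start` over one or more hops."""
    seen = {start}
    queue = deque([start])
    while queue:
        host = queue.popleft()
        for neighbor in links.get(host, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen

def is_accessible(host, item, links, holdings):
    """True if `host`, or any host connected to it, holds `item`."""
    return any(item in holdings.get(h, set())
               for h in reachable_hosts(host, links))

# Hypothetical network already split into two partitions: {M1, M2} and {M3, M4}.
links = {"M1": {"M2"}, "M2": {"M1"}, "M3": {"M4"}, "M4": {"M3"}}
holdings = {"M1": {"D1"}, "M2": set(), "M3": {"D2"}, "M4": set()}

print(is_accessible("M2", "D1", links, holdings))  # D1 is held inside M2's partition
print(is_accessible("M2", "D2", links, holdings))  # D2 is held only in the other partition
```

Placing a replica of D2 at M1 or M2 would make the second check succeed, which is the motivation for replication.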
Replica Allocation Methods
• Approach:
  - replicas are relocated periodically, once every relocation period
  - replica allocation is determined based on the access frequency and the network topology
Three Replica Allocation Methods
• The three methods differ in the emphasis put on access frequency and network topology.
  - SAF (Static Access Frequency): only the access frequency to each data item is taken into account.
  - DAFN (Dynamic Access Frequency and Neighborhood): the access frequency to each data item and the neighborhood among mobile hosts are taken into account.
  - DCG (Dynamic Connectivity based Grouping): the access frequency to each data item and the whole network topology are taken into account.
SAF (Static Access Frequency)
• Each host creates replicas in descending order of its access frequencies.
• Advantages:
  - no control information about replicas needs to be exchanged
  - once a host has all of its replicas, no further replica relocation takes place
  - low overhead and low traffic
• Disadvantages:
  - hosts with similar access characteristics hold the same replicas, although an MH can access data items held by other connected MHs and it would be more effective to share many kinds of replicas among them
  - data accessibility is therefore low when many hosts have the same or similar access characteristics
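A minimal sketch of SAF, assuming each host fills its C replica slots with the items it accesses most often and does not replicate an item whose original it already holds. The function name, the dict layout, and the example frequencies are hypothetical.

```python
def saf_allocate(access_freq, originals, capacity):
    """SAF sketch: each host picks replicas by its own access frequencies only.

    access_freq: {host: {item: frequency}}
    originals:   {host: set of items the host holds as originals}
    capacity:    C, the number of replica slots per host
    """
    allocation = {}
    for host, freqs in access_freq.items():
        # Assumption: a host never replicates an item it holds as an original.
        candidates = [d for d in freqs if d not in originals.get(host, set())]
        candidates.sort(key=lambda d: freqs[d], reverse=True)
        allocation[host] = candidates[:capacity]
    return allocation

# Two hypothetical hosts with identical access characteristics.
freqs = {"D1": 0.1, "D2": 0.4, "D3": 0.3, "D4": 0.2}
access_freq = {"M1": dict(freqs), "M2": dict(freqs)}
originals = {"M1": {"D1"}, "M2": {"D2"}}

print(saf_allocate(access_freq, originals, capacity=2))
```

Both hosts end up replicating D3, which illustrates the disadvantage noted above: no information is exchanged, so connected hosts duplicate each other's replicas.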
SAF example
DAFN (Dynamic Access Frequency and Neighborhood)
• The algorithm is as follows:
1) At each relocation period, every mobile host broadcasts its host identifier and its access frequencies to the data items. After all mobile hosts have completed their broadcasts, each host knows from the received host identifiers which mobile hosts it is connected to.
2) Each mobile host makes a preliminary replica allocation based on the SAF method.
3) In each set of mobile hosts that are connected to each other, the following procedure is repeated in breadth-first order, starting from the mobile host with the lowest suffix (i) of host identifier (Mi):
  - When two neighboring mobile hosts hold the same data item (original or replica) and one of the copies is the original, the host that holds the replica changes it to another replica.
  - If both copies are replicas, the host whose access frequency to the data item is lower changes its replica to another replica.
  - The new replica is chosen among the data items that are not allocated at either of the two hosts: the item with the highest access frequency among these candidates is selected.
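The duplicate-elimination step (step 3) might be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: hosts are identified by their integer suffix, each neighboring pair is processed once, "highest access frequency" is read as the frequency at the host that changes its replica, ties go against the second host of the pair, and a duplicate is kept when no candidate item exists. The name `dafn_relocate` and the example data are hypothetical.

```python
from collections import deque

def dafn_relocate(links, access_freq, originals, replicas):
    """Sketch of DAFN step 3: remove duplicates between neighboring hosts.

    links:       {host_index: set of neighbor indices}
    access_freq: {host_index: {item: frequency}}
    originals:   {host_index: set of items held as originals}
    replicas:    {host_index: set of items from the preliminary SAF step}
    """
    replicas = {h: set(r) for h, r in replicas.items()}  # work on a copy

    def held(h):
        return originals[h] | replicas[h]

    def change_replica(host, other, item):
        # Candidates: items held at neither of the two hosts.
        candidates = [d for d in access_freq[host]
                      if d not in held(host) and d not in held(other)]
        if not candidates:
            return  # assumption: keep the duplicate if nothing else fits
        # Assumption: "highest access frequency" means at the changing host.
        best = max(candidates, key=lambda d: access_freq[host][d])
        replicas[host].discard(item)
        replicas[host].add(best)

    visited, done_pairs = set(), set()
    for root in sorted(links):              # lowest suffix first
        if root in visited:
            continue
        visited.add(root)
        queue = deque([root])
        while queue:                        # breadth-first order
            a = queue.popleft()
            for b in sorted(links[a]):
                if b not in visited:
                    visited.add(b)
                    queue.append(b)
                pair = frozenset((a, b))
                if pair in done_pairs:
                    continue
                done_pairs.add(pair)
                for item in sorted(held(a) & held(b)):
                    if item in originals[a]:
                        change_replica(b, a, item)
                    elif item in originals[b]:
                        change_replica(a, b, item)
                    elif access_freq[a][item] < access_freq[b][item]:
                        change_replica(a, b, item)
                    else:                   # assumption: ties go against b
                        change_replica(b, a, item)
    return replicas

# Hypothetical two-host example; `preliminary` stands for the SAF result with C = 2.
links = {1: {2}, 2: {1}}
access_freq = {
    1: {"D1": 0.1, "D2": 0.4, "D3": 0.3, "D4": 0.2, "D5": 0.15, "D6": 0.05},
    2: {"D1": 0.1, "D2": 0.4, "D3": 0.25, "D4": 0.2, "D5": 0.15, "D6": 0.05},
}
originals = {1: {"D1"}, 2: {"D2"}}
preliminary = {1: {"D2", "D3"}, 2: {"D3", "D4"}}

print(dafn_relocate(links, access_freq, originals, preliminary))
```

In this example host 1 gives up its replica of D2 (host 2 holds the original) and host 2 gives up D3 (its access frequency is lower), so the two neighbors end up holding six distinct items between them.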
DAFN (Dynamic Access Frequency and Neighborhood)
• Aims to eliminate replica duplication among neighboring hosts
• The procedure is executed every relocation period
• Overhead and traffic are much higher than with SAF
• Replica duplication is not eliminated completely
• If the network topology changes while the method is executing, replica relocation cannot be completed
DAFN example
DCG (Dynamic Connectivity based Grouping)
• Biconnected component: a maximal subgraph that remains connected when any one of its vertices is removed (high stability!)
• The algorithm is as follows:
1) At each relocation period, every mobile host broadcasts its host identifier and its access frequencies to the data items. After all mobile hosts have completed their broadcasts, each host knows from the received host identifiers which mobile hosts it is connected to.
2) In each set of mobile hosts that are connected to each other, an algorithm to find biconnected components is executed, starting from the mobile host with the lowest suffix (i) of host identifier (Mi). Each biconnected component then forms a group. If a mobile host belongs to more than one biconnected component, i.e., the host is an articulation point, it belongs only to the group whose biconnected component is found first while executing the algorithm.
3) In each group, the access frequency of the group to each data item is calculated as the sum of the access frequencies of the mobile hosts in the group to that item. The calculation is done by the mobile host with the lowest suffix (i) of host identifier (Mi) in the group.
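The grouping in step 2 can be sketched with a standard depth-first search for biconnected components (a Hopcroft-Tarjan style edge stack). The slides do not say which algorithm is used, so the choice of algorithm, the function name, the sorted neighbor order, and the reading of "first found" as "first completed by the search" are all assumptions. The recursion is fine for small networks such as the 40-host simulation.

```python
def biconnected_groups(links):
    """Group hosts by biconnected component (sketch of DCG step 2).

    links: {host_index: set of neighbor indices}, undirected, no multi-edges.
    An articulation point joins only the first component found that contains it.
    """
    index, low = {}, {}      # discovery order and low-link value per host
    edge_stack = []
    components = []          # biconnected components in the order found
    counter = 0

    def dfs(u, parent):
        nonlocal counter
        index[u] = low[u] = counter
        counter += 1
        for v in sorted(links[u]):
            if v == parent:
                continue
            if v not in index:                    # tree edge
                edge_stack.append((u, v))
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if low[v] >= index[u]:            # u closes a component
                    comp = set()
                    while True:
                        a, b = edge_stack.pop()
                        comp.update((a, b))
                        if (a, b) == (u, v):
                            break
                    components.append(comp)
            elif index[v] < index[u]:             # back edge to an ancestor
                edge_stack.append((u, v))
                low[u] = min(low[u], index[v])

    for root in sorted(links):                    # lowest suffix first
        if root not in index:
            dfs(root, None)
            if not links[root]:
                components.append({root})         # isolated host forms its own group

    groups, assigned = [], set()
    for comp in components:
        members = comp - assigned                 # drop hosts already grouped
        if members:
            groups.append(sorted(members))
            assigned |= members
    return groups

# Hypothetical topology: two triangles {1, 2, 3} and {3, 4, 5} sharing host 3.
links = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(biconnected_groups(links))
```

Host 3 is an articulation point here. With this search order the component {3, 4, 5} is completed first, so host 3 is grouped with hosts 4 and 5, and hosts 1 and 2 form the second group.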
DCG (Dynamic Connectivity based Grouping)
4) In descending order of the group's access frequencies, replicas of data items are allocated until the memory space of all mobile hosts in the group is full. Replicas of data items that are held as originals by mobile hosts in the group are not allocated. Each replica is allocated at the mobile host whose access frequency to the data item is the highest among the hosts that have free memory space to create it.
5) If free memory space remains at mobile hosts in the group after replicas of all kinds of data items have been allocated, further replicas are allocated in descending order of access frequency until the memory space is full. Each replica is allocated at the mobile host whose access frequency to the data item is the highest among the hosts that have free memory space to create it and do not already hold the replica or its original. If there is no such mobile host, the replica is not allocated.
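Steps 3 to 5 might look like this for a single group. This is a sketch under assumptions: the function name and example values are hypothetical, ties between hosts go to the host listed first in the group, and step 5 is read as repeated passes over the items in group-frequency order until no further replica can be placed.

```python
def dcg_allocate(group, access_freq, originals, capacity):
    """Sketch of DCG steps 3-5 for one group of hosts.

    group:       list of host indices in the group
    access_freq: {host_index: {item: frequency}}
    originals:   {host_index: set of items held as originals}
    capacity:    C, the number of replica slots per host
    """
    items = sorted({d for h in group for d in access_freq[h]})
    # Step 3: group access frequency = sum over the hosts in the group.
    group_freq = {d: sum(access_freq[h].get(d, 0.0) for h in group) for d in items}
    order = sorted(items, key=lambda d: group_freq[d], reverse=True)
    group_originals = set().union(*(originals[h] for h in group))
    replicas = {h: set() for h in group}

    def place(item):
        """Put one replica of `item` at the eligible host that accesses it most."""
        hosts = [h for h in group
                 if len(replicas[h]) < capacity
                 and item not in replicas[h]
                 and item not in originals[h]]
        if not hosts:
            return False
        best = max(hosts, key=lambda h: access_freq[h].get(item, 0.0))
        replicas[best].add(item)
        return True

    # Step 4: one replica per item, skipping items whose original is in the group.
    for item in order:
        if item not in group_originals:
            place(item)

    # Step 5 (assumed reading): keep adding copies until nothing more fits.
    progress = True
    while progress:
        progress = False
        for item in order:
            if place(item):
                progress = True
    return replicas

# Hypothetical group of two hosts with C = 2.
access_freq = {
    1: {"D1": 0.1, "D2": 0.4, "D3": 0.3, "D4": 0.2, "D5": 0.15},
    2: {"D1": 0.1, "D2": 0.4, "D3": 0.25, "D4": 0.2, "D5": 0.15},
}
originals = {1: {"D1"}, 2: {"D2"}}
print(dcg_allocate([1, 2], access_freq, originals, capacity=2))
```

In the example, step 4 places D3 and D4 at host 1 and D5 at host 2. The one remaining slot at host 2 is filled in step 5 with a second copy of D3, because host 2 already holds the original of D2.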
DCG (Dynamic Connectivity based Grouping)
• Data accessibility is expected to be higher, since replicas are shared among a group of hosts
• Overhead and traffic are higher than with the other two methods, since DCG consists of more steps and takes the longest of the three methods to relocate replicas over a wide range
• The probability that the network topology changes while the method is executing is higher; in that case, replica relocation cannot be done at mobile hosts over disconnected links
DCG example
Simulation Model
• 50 × 50 flatland
• Each host moves randomly in all directions
• Movement speed is determined randomly between 0 and d
• Radio communication range is a circle of radius R (varied from 1 to 19, otherwise fixed at 7)
• Number of hosts = number of data items = 40
• Each host creates up to C replicas (C varied from 1 to 39, otherwise fixed at 10)
• The access frequency of each host to Di is pi, given by one of three cases:
  - Case 1: pi = 0.5(1 + 0.01i)
    Every host has the same access characteristics; access frequencies vary over a small range
  - Case 2: pi = 0.025i
    Every host has the same access characteristics; access frequencies vary over a wide range
  - Case 3: pi is determined as a positive value based on N(0.5(1 + 0.01i), σ)
    The larger the value of σ, the greater the difference in access characteristics between hosts
• Relocation period T (varied from 1 to 8192, otherwise fixed at 256)
• Simulated for 59,000 time units, with traffic measured as the number of hops used for relocating data
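The three access-frequency cases can be written out as follows. For the 40 data items, Case 1 ranges from 0.505 to 0.70 and Case 2 from 0.025 to 1.0. The slides do not say how Case 3 is kept positive, so resampling until the value is positive is an assumption, as are the function names, the seed, and the example value of σ.

```python
import random

def access_freq_case1(i):
    """Case 1: p_i = 0.5 (1 + 0.01 i), a small range of frequencies."""
    return 0.5 * (1 + 0.01 * i)

def access_freq_case2(i):
    """Case 2: p_i = 0.025 i, a wide range of frequencies."""
    return 0.025 * i

def access_freq_case3(i, sigma, rng):
    """Case 3: a positive draw from a normal distribution centered on Case 1.

    Assumption: non-positive draws are rejected and redrawn.
    """
    while True:
        p = rng.gauss(access_freq_case1(i), sigma)
        if p > 0:
            return p

NUM_ITEMS = 40
rng = random.Random(0)   # hypothetical fixed seed, for repeatability only
SIGMA = 0.5              # hypothetical value of sigma

case1 = [access_freq_case1(i) for i in range(1, NUM_ITEMS + 1)]
case2 = [access_freq_case2(i) for i in range(1, NUM_ITEMS + 1)]
case3 = [access_freq_case3(i, SIGMA, rng) for i in range(1, NUM_ITEMS + 1)]

print(min(case1), max(case1))
print(min(case2), max(case2))
```

In Cases 1 and 2 every host uses the same list, while in Case 3 each host would draw its own list, which is what makes the access characteristics differ between hosts.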
Conclusion
• Introduced replica allocation in ad hoc networks as a mechanism for improving data accessibility
• Proposed three replica allocation methods that use access patterns and the network topology
• Simulation results show that DCG gives the highest accessibility at the cost of increased traffic, while SAF has the least traffic but low data accessibility
• The appropriate replica allocation method depends on the system configuration and the access patterns
Any Questions?
Thank you!