Journal of Visual Communication and Image Representation 10, 197–218 (1999)
Article ID jvci.1999.0420, available online at http://www.idealibrary.com
Data Allocation and Dynamic Load Balancing
for Distributed Video Storage Server∗
Shiao-Li Tsao,†,‡ Meng Chang Chen,†,1 Ming-Tat Ko,†
Jan-Ming Ho,† and Yueh-Min Huang‡
†Institute of Information Science, Academia Sinica, Taiwan; ‡Department
of Engineering Science, National Cheng Kung University, Taiwan
E-mail: [email protected]
Received January 20, 1998; accepted March 8, 1999
In this paper, a novel initial video allocation scheme and a dynamic load balancing
strategy are proposed for a distributed video storage server in order to increase
availability and reduce operating cost. The initial allocation scheme determines the
placement of video replicas on the servers to achieve static load balance and to
obtain a configuration suited to efficient dynamic load adjustment. Simulation
results show that the proposed load shifting algorithm reduces the request fail rate
by up to 50% compared with the same initialization algorithm without load shifting.
The proposed initial allocation with load shifting also reduces the request fail rate
by 25% to 60% relative to the least-load-first initial allocation scheme with load
shifting, and reduces the request fail rate by 5% to 10% and the number of shifting
steps by 5% to 25% relative to the DASD dancing method. Moreover, a prototype is
implemented on Windows NT to examine the correctness and practicability of the
proposed schemes. © 1999 Academic Press
1. INTRODUCTION
Recent advances in computing and communication technologies enable distributed
multimedia applications in which video storage servers play an important role. One of the
important issues in designing video servers is scalability [1–3]. Recently, distributed
video (storage) servers composed of low-end computers have been proposed to address this issue
[4–8]. Even when the video servers as a whole have sufficient computing power to serve all
the current user requests, some user requests may still fail because the computers holding the
desired videos do not have the computing power available to handle them. As low-end
computers have limited storage space and computing power, it is extremely important to
properly allocate the video files and balance the user requests among them. A statistical
result of long-term observation of user behaviors blended with domain expertise, called
∗ This research is partially supported by NSC under Grant NSC86-2213-E-001-022. This article was originally
part of the Special Section on Multimedia Storage and Archiving Systems, which appeared in the Journal of Visual
Communication and Image Representation, Vol. 9, No. 4, December 1998.
1 Corresponding author.
expected request pattern, is used to guide the video file allocation process. However, as
user requests for videos are dynamic in nature, the actual request pattern deviates from the
expected request pattern [9–11], which increases request failures.
One possible data allocation solution is to stripe every video file over all servers [12, 13].
The advantage of this approach is that it achieves maximum load balancing, since user
requests can be served by any available server. However, the system is not reliable, since
the failure of one server results in the failure of the whole system. System synchronization
overhead and a complex system control mechanism are also drawbacks. Alternatively,
we can replicate videos on several servers so that requests can be migrated to other
servers during server failures or for load balancing purposes [14, 3]. The cost of this approach
is extra disk space [7]. The fundamental policies of data replication were explored in a
previous study [11]. Serpanos et al. [15] proposed a data replication scheme for distributed
multimedia servers to achieve both load and storage balance. Under the assumption that the
actual requests are identical to the expected load, they focused on the initial allocation
of video replicas to achieve static load balance. However, the actual requests may differ
from the expected demands for two reasons. First, the expected load is calculated
and forecasted from long-term statistical data that may not represent the short-term
request pattern. Second, the access probabilities of some videos may not be accurately
predicted (e.g., an actor in a movie won an award yesterday). The discrepancy between the actual
and expected request patterns induces request failures.
Wolf et al. [16] proposed the DASD dancing algorithm to balance the user requests
of a multidisk server. They modeled disks as nodes of a graph, and a pair of replicas of the same
video on two disks as an edge between the corresponding nodes. Their initial allocation algorithm
tries to reduce the diameter of the graph. They then proposed an on-line load balance
scheme on hard disks, called DASD dancing, to migrate progressing requests among disks.
However, the DASD dancing scheme assumes that each replica of a video has the same access
probability, to simplify the problem, so the graph they constructed is undirected and
unweighted. Moreover, it does not consider reducing the number of shifting steps (i.e.,
request migrations) when performing dancing, as the cost of migration is negligible in a
single-server environment. In a distributed environment, each migration of a user request
from one server to another incurs the cost of control message passing, admission
control, and job rescheduling. This cost varies from environment to environment,
and sometimes it is too large to be ignored. In this paper, we elaborate a novel initial data
allocation algorithm to obtain high connectivity between servers, together with a dynamic
load balancing strategy called load shifting to efficiently migrate progressing requests
in a distributed environment. The major differences between DASD dancing [16] and
our approach are that our approach considers the access probabilities of video replicas to
optimize the connectivity between servers and that it also reduces the number of shifting steps.
We discuss the implementation issues and prototype our scheme on Windows NT to prove
its practicability.
The rest of the paper is organized as follows. The architecture of a distributed video
server, the basic concept of load shifting procedure, and notations in the paper are described
in Section 2. The initial data allocation and reallocation algorithms are proposed
and discussed in Section 3. The load shifting algorithm is presented in Section 4. In Section 5, the
simulation results and implementation issues are discussed. Finally, we conclude the paper
in Section 6.
FIG. 1. The environment of a video service system.
2. BASIC CONCEPTS OF LOAD SHIFTING
2.1. System Architecture
The environment of a distributed video service system is depicted in Fig. 1. The elements
of the system include tertiary storage servers, disk-based video storage server clusters, a
backbone network (e.g., ATM), an access network (e.g., HFC), and end-user equipment. All
of the video data are stored in the tertiary storage servers, and hot videos are cached in
the disk-based video server clusters. In this paper, we focus on the disk-based video server
cluster. We apply file replication among the servers within a cluster. A video file is the basic unit
of replication and may be striped over all the disks within a video server; in other words, a
video file is never striped across the network. A cluster of servers shares a single name, called
the cluster name, which can be an IP address or a domain name. Clients request
video services by referring to the unique cluster name. The request is received by all servers,
but only one server within the cluster responds to it. A progressing request
may be migrated to other servers within the same cluster dynamically. In this research,
we extend the OneIP approach proposed by Damani et al. [17], a technique for
sharing a single IP address among the servers within a cluster. We describe OneIP in detail and
present its extension to support our scheme in Section 5.
Before giving a formal description of the problem and our solution, we first illustrate
the complete processing flow of a user request in the distributed video service system. A
user first requests the video service by using the cluster name. As described above,
the request to a cluster is received by all servers, and only one server within the cluster
handles it. In order to achieve load balancing, progressing requests may
be migrated to other servers within the same cluster. As the migration is transparent to
the user, the user uses the same cluster name during the entire service. Occasionally, the
hand-off procedure of a migrated request may not execute smoothly due to the timing
difference between the two servers, so a certain amount of buffering on the client side is required.
Lee studied the synchronization problems between multiple servers delivering data to a
client and classified the architectures into three approaches [18]: proxy-at-server,
independent proxy [19], and proxy-at-client. Our design for the synchronization problem
is basically a proxy-at-client solution.
2.2. Concepts of Load Shifting
In a video storage server cluster, when a request for a particular video arrives and no
server holding the desired video has sufficient computing power to serve it, the request is
blocked. A blocked request can be served only when one of the servers holding the desired
video reclaims sufficient resources previously allocated to other requests. The basic concept
of the load shifting scheme is that a blocked request on a server can be admitted by shifting
a progressing request on that server to another server within the same cluster; the shifted
request can be admitted by the other server immediately or after one or more further shifts.
Note that shift operations can act as a chain reaction until all the shifted requests are
admitted. In Fig. 2, the cluster consists of four video servers; each server can store two
videos and serve up to four requests simultaneously. Assume that three of the four video
servers are fully utilized; only Server 4 has spare capacity. Meanwhile, a request for Video 1
arrives. Video 1 is stored only on Server 1, but Server 1 has no resources to serve the request.
Server 1 then migrates a request for Video 2 to Server 2, Server 2 migrates a request for
Video 3 to Server 3, and Server 3 migrates a request for Video 4 to Server 4.
Finally, the new request for Video 1 can be admitted without dropping any request.
FIG. 2. Video i/j means j requests on video i.
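To make the chain reaction concrete, the following minimal sketch replays the Fig. 2 walkthrough (the per-server video placement is inferred from the text, and Server 4 is assumed to start with three progressing streams):

```python
AMAX = 4                                            # streams per server
videos = {1: {1, 2}, 2: {2, 3}, 3: {3, 4}, 4: {4}}  # server -> videos stored
load = {1: 4, 2: 4, 3: 4, 4: 3}                     # progressing streams

def shift_chain(path):
    """Apply a shifting path [(src, video, dst), ...]; the first triple
    (None, video, dst) assigns the new request to server dst."""
    for src, video, dst in path:
        assert video in videos[dst], "target must store the video"
        if src is not None:
            load[src] -= 1        # migrate one progressing stream out ...
        load[dst] += 1            # ... and onto the next server in the chain
    assert all(l <= AMAX for l in load.values())

# A new request for Video 1 arrives; Server 1 is full, so push a chain.
shift_chain([(None, 1, 1), (1, 2, 2), (2, 3, 3), (3, 4, 4)])
print(load)   # {1: 4, 2: 4, 3: 4, 4: 4} -- admitted without dropping anyone
```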
Before describing our proposed initial allocation strategy and shifting scheme, we fix
some notation. Let S and M denote the number of servers and the number of videos in
the system, respectively. For a video server cluster, all the video files are stored in the tertiary
storage servers, and a limited number of videos are cached on the video servers within a cluster.
We assume that the servers are identical, with a disk storage capacity of C videos and the
computing capability to serve a maximum of $A_{max}$ users. For a large-scale system,
M is usually much larger than C · S, i.e., $M \gg C \cdot S$, and only the K hottest videos are
cached in the cluster. In order to improve the availability of a video server cluster, some
videos may be replicated several times and stored on different servers. We define $V_{i,j}$ as

$$V_{i,j} = \begin{cases} 1, & \text{if a copy of video } i \text{ is stored on server } j;\\ 0, & \text{otherwise.} \end{cases}$$

$P_{i,j}$ denotes the number of progressing streams of video $i$ on server $j$, with the following properties: for any server $j$, $\sum_{i=1}^{M} P_{i,j} \le A_{max}$, and if $P_{i,j} \ge 1$, then $V_{i,j} = 1$. The first property, $\sum_{i=1}^{M} P_{i,j} \le A_{max}$, means that the number of serving requests on any server must not exceed its total service capacity. The second property states that video $i$ can have progressing streams on server $j$ only if server $j$ stores a copy of video $i$.
When a new request arrives at the cluster and no server with the desired video is available, the load-shifting procedure tries to adjust the load of the servers to accommodate the new request. For example, when a new request for video $V_{req}$ arrives while no server with video $V_{req}$ is available, the request is blocked. Meanwhile, the shifting algorithm proceeds to find a feasible shifting path in order to admit the request. A request is called failed if it is blocked and no feasible shifting path can be found. We define a shifting path for a new request on video $V_{req}$ as

$$SP(V_{req}) = \{(\mathrm{NULL}, V_{req}, S_{i_1}), (S_{i_1}, V_{j_1}, S_{i_2}), (S_{i_2}, V_{j_2}, S_{i_3}), \ldots, (S_{i_{d-1}}, V_{j_{d-1}}, S_{i_d})\},$$

where each triple $(S_i, V_j, S_{i'})$ is a shifting step indicating that a progressing stream of video $V_j$ on server $S_i$ is migrated to server $S_{i'}$. The first triple $(\mathrm{NULL}, V_{req}, S_{i_1})$ in the shifting path indicates that the new request is assigned to server $S_{i_1}$.
A feasible shifting path satisfies the following properties:
• For every triple $(S_i, V_j, S_{i'})$ in the shifting path, $V_{j,i} = 1$, $P_{j,i} \ge 1$, and $V_{j,i'} = 1$. That is, for any shifting step, both the server migrating out a request and the server taking over the request store the desired video of the migrated request.
• $\sum_{i=1}^{M} P_{i,S_{i_d}} < A_{max}$. That is, the number of progressing streams on the last server in the shifting path must be less than the maximum number of streams supported by a server.
• The number of serving streams on each server, excluding the last one in the shifting path, is unchanged after the shifting procedure is performed.
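These properties translate directly into a programmatic check. Below is a minimal sketch of such a check (the data layout, with V[s] the set of videos on server s and P[s][v] the per-video stream counts, is our own choice):

```python
def is_feasible(path, V, P, a_max):
    """Validate a candidate shifting path against the three properties.
    path = [(None, v_req, s1), (s1, v1, s2), ..., (s_{d-1}, v_{d-1}, s_d)]."""
    _, v_req, first = path[0]
    if v_req not in V[first]:          # the new request needs its video there
        return False
    # Property 1: both ends of every step store the video, and the out-going
    # server actually has a progressing stream of it to migrate.
    for src, v, dst in path[1:]:
        if v not in V[src] or P[src].get(v, 0) < 1 or v not in V[dst]:
            return False
    # Property 2: only the last server on the path needs spare capacity.
    last = path[-1][2]
    if sum(P[last].values()) >= a_max:
        return False
    # Property 3 holds by construction: every intermediate server loses
    # exactly one stream and gains exactly one.
    return True
```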
3. INITIAL ALLOCATION STRATEGY
To increase the service availability of a distributed video storage server, the first issue
is to devise an initial data allocation algorithm that achieves static load balance among
the servers and improves the possibility of finding feasible shifting paths for blocked
requests. The next issue is to develop an efficient load shifting algorithm that reduces the number of
shifting steps. Suppose each video, say video $V_i$, has an expected access probability $Pr_i$. We want to determine the number of replicas of each video given the expected access probabilities of the K cached videos and the storage capacity of the video server cluster, i.e., C · S. Intuitively, the number of replicas of a video should be proportional to its access probability, since a video with more replicas on different servers has higher availability. Since storing more than one copy of a video on the same server cannot improve its availability, an obvious constraint is that the maximal number of replicas of a video must not exceed the number of video servers, i.e., S. Another constraint is the storage limitation of the servers. The numbers of replicas are integers, so apportioning the C · S units of storage space among the K cached videos according to their access probabilities is an integer resource allocation problem [20]. Several fair integer resource allocation algorithms, which determine the number of replicas of each video under the above constraints, were investigated in [20, 21]. Roughly, the number of replicas $R_i$ of video $V_i$ can be calculated as

$$R_i = \frac{Pr_i}{\sum_{j=1}^{K} Pr_j} \cdot C \cdot S.$$
For the details of the resource allocation algorithms, readers can refer to [20].
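As an illustration, a largest-remainder (Hamilton) apportionment can be sketched as follows; the handling of the minimum-one-copy and at-most-S constraints is our simplification of the algorithms in [20]:

```python
def hamilton_replicas(probs, capacity, num_servers):
    """Apportion `capacity` storage slots (C * S) among the cached videos in
    proportion to their access probabilities, with 1 <= R_i <= num_servers.
    Assumes capacity is at least the number of videos."""
    total = sum(probs)
    quotas = [p / total * capacity for p in probs]          # ideal shares
    counts = [min(num_servers, max(1, int(q))) for q in quotas]
    # Hand out the remaining slots by largest fractional remainder.
    order = sorted(range(len(probs)),
                   key=lambda i: quotas[i] - int(quotas[i]), reverse=True)
    k = 0
    while sum(counts) < capacity and any(c < num_servers for c in counts):
        i = order[k % len(order)]
        if counts[i] < num_servers:
            counts[i] += 1
        k += 1
    return counts
```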
After deciding the number of replicas of every video, we further allocate the access probability of the video to its replicas. Let $Pr_i^j$ denote the access probability of the $j$th copy of the $i$th video. The sum of the access probabilities of all the replicas equals the access probability of the video, i.e., $Pr_i = \sum_{j=1}^{R_i} Pr_i^j$. The initial data allocation can be represented as a directed graph, called the connection graph. Each server is a node in the graph, and an edge (or a connection) between two servers indicates that there is a video whose replicas are stored on both of the servers. Each edge has a weight, defined as the access probability of the replica stored in the out-going node of the edge. In other words, the weight of the edge is the probability that a request will be shifted from the out-going node to the in-coming node. The edges always come in pairs, and the two edges of a pair may have different weights if the replicas have different access probabilities. For example, the connection graph in Fig. 3 shows the access probability allocation of videos in Table 1.
A blocked request can be admitted only if there are feasible shifting paths from the designated server. To increase the possibility of finding feasible shifting paths for a blocked request, it is favorable that the designated server, as well as its neighbors, has many out-going edges with high access probabilities. We define the connectivity of a server as the total access probability carried by its video replicas; i.e., the connectivity $C_l$ of the $l$th server is defined as

$$C_l = \sum_{i=1}^{M} \sum_{k=1,\,k \ne l}^{S} V_{i,l} \cdot V_{i,k} \cdot Pr_i^{n(i,l)},$$

where $n(i,l)$ is a mapping function denoting that the $n(i,l)$th replica of the $i$th video is stored on the $l$th server. As the total connectivity of all the servers is fixed for a given $\{R_i, 1 \le i \le M\}$ and $\{Pr_i^j, 1 \le i \le M, 1 \le j \le R_i\}$, it is desirable to balance the connectivity of the servers. We therefore define the objective function as maximizing the product $C_1 \cdot C_2 \cdot C_3 \cdots C_S$, since $C_1 \cdot C_2 \cdot C_3 \cdots C_S$ is maximal when $C_1 \approx C_2 \approx C_3 \approx \cdots \approx C_S$.
TABLE 1
The Access Probabilities of Video Replicas

           Server #1   Server #2   Server #3   Server #4   Total access probability
Video 1      0.03        0.01                                       0.04
Video 2                  0.03        0.01                           0.04
Video 3                              0.03        0.01               0.04
Video 4      0.01                                0.03               0.04
FIG. 3. The connection graph for the initial allocation shown in Table 1.
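As a sanity check, the connectivity of each server can be computed directly from the Table 1 allocation; the sketch below follows our reading of the table and shows that the configuration is perfectly balanced:

```python
# alloc[video][server] = access probability of the replica on that server
alloc = {
    1: {1: 0.03, 2: 0.01},
    2: {2: 0.03, 3: 0.01},
    3: {3: 0.03, 4: 0.01},
    4: {4: 0.03, 1: 0.01},
}

def connectivity(server):
    """C_l: each replica on `server` contributes its access probability once
    for every other server that holds a copy of the same video."""
    return sum(copies[server] * (len(copies) - 1)
               for copies in alloc.values() if server in copies)

for s in (1, 2, 3, 4):
    print(s, connectivity(s))   # 0.04 for every server -- balanced
```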
Once we have the number of replicas of each video determined by an apportionment algorithm [20] and the access probability of each replica obtained by an access probability allocation algorithm, we can allocate these replicas to servers. The access probability allocation algorithm apportions the access probability of one video among its replicas. Several access probability allocation algorithms have been proposed, such as the uniform apportionment of the access probability over all replicas, i.e., $Pr_i^j = Pr_i / R_i$, by [16], or the nonuniform apportionment algorithm proposed by [15]. Our initial allocation algorithm can obtain a balanced connectivity of a video storage server cluster under any access probabilities produced by any apportionment algorithm. The proposed initial allocation strategy allocates the videos from the hottest to the coldest. For a video $V_i$ with $R_i$ copies, we allocate the $R_i$ copies to $R_i$ distinct servers so as to maximize $C_1 \cdot C_2 \cdot C_3 \cdots C_S$ at the moment. After the video replica with access probability $Pr_i^{n(i,j)}$ is allocated to server $j$, the server gains $Pr_i^{n(i,j)}$ access probability, and the connectivity of the server is increased by $Pr_i^{n(i,j)} \cdot (R_i - 1)$, since the server creates $R_i - 1$ edges, each with weight $Pr_i^{n(i,j)}$, to the other $R_i - 1$ servers. If more than one set of $R_i$ servers maximizing $C_1 \cdot C_2 \cdot C_3 \cdots C_S$ is found, the set of $R_i$ servers with the lowest total access probability is selected. Videos with only one copy do not contribute to the connectivity and are assigned to the server with the minimal access probability; that is, videos with only one copy are allocated to a server in a least-load-first manner. Figure 4 shows the proposed initial allocation algorithm.

FIG. 4. The proposed initial allocation algorithm.
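A compact sketch of the algorithm in Fig. 4 is given below. It is a simplification: per-server storage limits are omitted, videos are assumed to arrive sorted from hottest to coldest, and ties in the connectivity product are broken by the lowest total access probability, as described above:

```python
import itertools
import math

def initial_allocation(replicas, num_servers):
    """replicas maps video -> list of replica access probabilities,
    iterated from the hottest video to the coldest."""
    conn = [0.0] * num_servers   # running connectivity C_l of each server
    prob = [0.0] * num_servers   # total access probability (tie-break / LLF)
    placement = {s: [] for s in range(num_servers)}

    def place(video, servers, probs):
        for s, p in zip(servers, probs):
            conn[s] += p * (len(probs) - 1)  # R_i - 1 new out-going edges
            prob[s] += p
            placement[s].append(video)

    for video, probs in replicas.items():
        r = len(probs)
        if r == 1:
            # Single-copy videos add no edges; assign least-load-first.
            place(video, [min(range(num_servers), key=prob.__getitem__)], probs)
            continue

        def score(servers):
            trial = conn[:]
            for s, p in zip(servers, probs):
                trial[s] += p * (r - 1)
            # Maximize the product; tie-break on lowest total probability.
            return (math.prod(trial), -sum(prob[s] for s in servers))

        best = max(itertools.combinations(range(num_servers), r), key=score)
        place(video, best, probs)
    return placement
```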
To support dynamic changes of the system, including the insertion of new servers and files and changes in the access probabilities of videos, we propose two video reallocation policies. The first one, called the partial reallocation policy, requires fewer updates on the servers but cannot guarantee the optimal connectivity between servers. The other one
is called the total reallocation policy; it reallocates all videos on the servers to obtain the
optimal connectivity. The total reallocation policy performs the initial allocation algorithm
presented in the previous section. While it obtains the optimal file allocation, the total
reallocation policy incurs the recomputation overhead of the initial allocation algorithm,
as well as expensive file copies within the cluster. The partial reallocation policy, on the
other hand, takes the original video locations into account and considers only the newly
inserted or changed videos. It first determines the number of replicas of each file according
to the storage space and the new access probabilities. If the number of replicas of a video
changes, these replicas are called undetermined replicas, as their locations are yet to be
determined; the set of undetermined replicas is called the undetermined set. The
undetermined replicas are then sorted by their access probabilities, and the policy runs
our proposed initial allocation algorithm to determine their new locations. The allocation
algorithm may not always find a feasible solution, e.g., for a file with five replicas when there
are only four different servers with available storage space. In this situation, the policy
removes the video with the least access probability from the servers, inserts all its replicas
into the undetermined set, and resumes the initial allocation algorithm. The flow chart of the
partial reallocation algorithm is depicted in Fig. 5. It is clear that the result is not optimal.
FIG. 5. Flowchart of the partial reallocation policy.
Table 2 shows an example. Originally, we have three servers and three files. Suppose we
insert a new file, say Video 4, and a new server, say Server D, into the cluster; the access
probabilities also change. Table 2 gives the new file locations obtained by applying the
partial reallocation and total reallocation policies. The allocation obtained by the partial
reallocation policy introduces no video copying on the original servers, but it obtains a
connectivity value of only
TABLE 2
An Example of File Locations by Applying Different Reallocation Policies
(each server has storage space for two videos; Server D is the new server)

Original file allocation
                       Server A   Server B   Server C              Total access probability   Number of replicas
Video 1                   ×          ×          ×                           0.5                       3
Video 2                   ×          ×                                      0.3                       2
Video 3                                         ×                           0.2                       1

New file allocation by partial reallocation policy
                       Server A   Server B   Server C   Server D   Total access probability   Number of replicas
Video 1                   ×          ×          ×                           0.4                       3
Video 2                   ×          ×                                      0.3                       2
Video 3                                         ×          ×                0.2                       2
Video 4 (new video)                                        ×                0.1                       1

New file allocation by total reallocation policy
                       Server A   Server B   Server C   Server D   Total access probability   Number of replicas
Video 1                   ×          ×          ×                           0.4                       3
Video 2                   ×                                ×                0.3                       2
Video 3                              ×                     ×                0.2                       2
Video 4 (new video)                             ×                           0.1                       1
6.65 × 10⁻³. On the other hand, the total reallocation policy copies two files onto the original
servers and obtains a higher connectivity value, 9.59 × 10⁻³. In order not to interrupt the
video services, the unused system resources are allocated to perform the reallocation.
Techniques for performing the reallocation on-line can be found in [22], so both the total and
partial reallocation policies can be applied to a system in operation.
4. LOAD SHIFTING SCHEME
The goal of load shifting is to make room to accommodate the difference between the actual
and expected access probabilities of videos. When a new request arrives,
it is assigned to one of the servers with the desired video, following the access probabilities
of the replicas. For example, a request for video 1 in Table 1 has probability 0.75 of being
assigned to Server 1 and 0.25 of being assigned to Server 2. If the access probabilities of all
the replicas are equal, the least loaded server with the desired video is selected. In the
proposed load shifting algorithm, the shifting procedure can be activated under either of the
following two conditions, where $S_j$ denotes the server selected for the new request:
• when $\sum_{i=1}^{M} P_{i,j} - \min_{j'} \{\sum_{i=1}^{M} P_{i,j'}\} \ge T_1$, where $S_{j'}$ ranges over the neighbors of $S_j$, or
• when $\sum_{i=1}^{M} P_{i,j} \ge A_{max} - T_2$.
T1 and T2 are predefined thresholds. The first condition compares the load of the selected
server with that of its neighbors to decide whether load shifting is needed, while the second
condition checks only the load of the selected server against a threshold. Both checks can
be performed as a background or a real-time job. The system administrator can set the values
of the two thresholds to obtain the desired operating policy. For instance, if T1 = ∞ and
T2 = 0, the load shifting process is performed in real time when a new request is blocked.
If 0 ≤ T1, T2 ≤ Amax, the load shifting process can be implemented as a background job to
reduce request delays.
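In code, the two activation conditions are direct comparisons; the sketch below mirrors them (the data layout, with total_load[s] the number of progressing streams on server s, is ours):

```python
def should_shift(j, total_load, neighbors, a_max, t1, t2):
    """Return True if load shifting should be activated for server j."""
    cond1 = total_load[j] - min(total_load[n] for n in neighbors[j]) >= t1
    cond2 = total_load[j] >= a_max - t2
    return cond1 or cond2

# t1 = float("inf"), t2 = 0: shift in real time, when a request is blocked.
# 0 <= t1, t2 <= a_max: shifting can instead run as a background job.
```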
The shifting procedure finds feasible shifting paths from the assigned server to servers with
system load less than $A_{max}$. In order to reduce the number of shifting steps needed to
accommodate the requests, the shortest feasible shifting path is selected. Since there may be
more than one shortest feasible shifting path, additional criteria are needed to distinguish
among them. As mentioned in the previous section, the initial allocation maximizes the
connectivity $C_1 \cdot C_2 \cdot C_3 \cdots C_S$ based on the expected access probabilities. For
load shifting, the run-time connectivity, denoted $C_i^{cur}$ for server $S_i$, is applied to select,
from the shortest feasible shifting paths, the one that maximizes the product
$C_1^{cur} \cdot C_2^{cur} \cdot C_3^{cur} \cdots C_S^{cur}$.
Since we model the allocation of video replicas on servers as a directed graph, breadth-first
search can be used to find the optimal shifting path. Figure 6 shows the proposed
load-shifting algorithm.

FIG. 6. The proposed load-shifting algorithm.
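A sketch of this breadth-first search is shown below; it returns one shortest feasible path and, for brevity, omits the run-time-connectivity tie-break of Fig. 6:

```python
from collections import deque

def find_shifting_path(v_req, locations, P, a_max):
    """locations[v] is the set of servers storing video v; P[s][v] is the
    number of progressing streams of video v on server s."""
    queue = deque(((None, v_req, s),) for s in locations[v_req])
    seen = set(locations[v_req])
    while queue:
        path = queue.popleft()
        last = path[-1][2]
        if sum(P[last].values()) < a_max:   # spare capacity: path is feasible
            return list(path)
        for v, n in P[last].items():        # try migrating one stream onward
            if n < 1:
                continue
            for nxt in locations[v]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(path + ((last, v, nxt),))
    return None                             # no feasible path: request fails
```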
5. SIMULATION RESULTS AND IMPLEMENTATION ISSUES
5.1. Simulations
In this section, we compare the performance of the proposed approach with others via
simulation. User requests are assumed to form a Poisson process with an arrival rate of
0.83 requests/minute. The simulated video server cluster consists of 10 identical video
servers, each of which can support up to ten 6-Mbps MPEG-2 streams simultaneously.
The video file length is assumed to be 5.4 GB for 2 h of playback. The top 100 popular videos
are cached on the disks, so at least one copy of each video can be found on one of the servers.
The number of replicas of each video is determined by applying the Hamilton method, one
of the frequently used integer resource allocation algorithms [20], and we adopt the uniform
apportionment to allocate the access probabilities of videos to the replicas.
distribution of requests on videos are modeled as Zipf’s distribution [23], which is defined
as
Pi =
c
i
,
(1−θ )
,
where c = 1
N
X
1
i 1−θ ,
i=1
i ∈ {1, 2, . . . , N }. Figure 7 shows the access probabilities of videos, based on Zipf’s distributions with various θ values.
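For reference, the snippet below computes these probabilities; note that θ close to 1 yields a near-uniform distribution, which matters for the simulations that follow:

```python
def zipf_probabilities(n, theta):
    """P_i = c / i**(1 - theta), with c chosen so the P_i sum to 1."""
    c = 1.0 / sum(1.0 / i ** (1.0 - theta) for i in range(1, n + 1))
    return [c / i ** (1.0 - theta) for i in range(1, n + 1)]

probs = zipf_probabilities(100, 0.1)   # the 100 cached videos, theta = 0.1
print(round(probs[0], 4), round(probs[-1], 4))   # hottest vs. coldest video
```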
In the first simulation, we allocate 64.8 GB of disk space to each server, so 120 copies of
video files can be stored in total; i.e., 20 of the copies are replicas. We forecast the expected
access distribution to be Zipf's distribution with θ = 0.1, while letting the actual access
probabilities range over Zipf's distributions from θ = 0.1 to θ = 0.9. Note that the initial
video allocation is based on the expected access distribution. We examine four different
initial data allocation and load balancing approaches: the least-load-first initial allocation
without load shifting (LLF without LS), the least-load-first allocation with load shifting
(LLF with LS), the DASD dancing method with dancing threshold ∞ (DASD), and the
proposed initial allocation algorithm with load shifting (CO with LS). The least-load-first
(LLF) initial data allocation strategy sorts the videos in descending order of access
probability and allocates the video replicas, in that order, to the server with the minimal
load. In order to have a fair comparison of the number of shifting steps, we set the thresholds
of the proposed approach to T1 = ∞ and T2 = 0, and we also set the dancing threshold of
the DASD dancing method
to infinity, which implies that request dancing is performed only when a new request is
blocked.

FIG. 8. The request fail rate over the actual demands under the expected parameter (θ = 0.1).
Figure 8 depicts the request fail rate over the distribution of actual demands. From the
figure, we learn that the request fail rate increases as the actual request distribution moves
from θ = 0.1 to θ = 0.9. The request fail rate is minimal when the actual demand distribution
fully matches the forecast, i.e., θ = 0.1 in this case. The figure also shows that LLF
initial allocation with load shifting reduces the request fail rate by 30% to 50% relative to
the LLF method without load shifting. The improvement in the service capacity of a video
server cluster from employing the load shifting procedure is thus quite significant. Compared
with the other approaches, our initial allocation scheme further reduces the request fail rate
by 25% to 60% relative to the LLF method with load shifting and by around 5% relative to
DASD dancing. Our approach obtains the lowest request fail rate because our definition of
connectivity considers both the number of video replicas and their access probabilities,
which better balances the loads of the servers.

FIG. 9. The number of shifting steps per 100 requests over the actual demand under the expected parameter (θ = 0.1).
Figure 9 shows the number of shifting steps per 100 requests over the distribution of
actual demands with the expected parameter (θ = 0.1). The figure should be viewed with
caution, as failed requests are counted in the total requests; fewer shifting steps may
therefore reflect either a better algorithm or a higher fail rate. While the proposed approach
reduces the request fail rate by 25% to 60% relative to the LLF approach, it introduces 8%
to 30% more shifting steps. Compared with the DASD approach, our approach reduces the
number of shifting steps by 10% to 25%, and a 5% reduction in fail rate can still be seen
in Fig. 8.
To have a clear view of the combined effects of the shifting ability and system availability
provided by the initial data allocation schemes, we define a reward function to evaluate
their performance. We assume that the system earns $RE_a$ units of reward when accepting a
new request, while paying a penalty of $RE_p$ units for each shifting step. The reward
function can then be used as a basis for comparing different initial data allocation approaches.
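Written out (the symbols $N_{acc}$ and $N_{shift}$ for the numbers of accepted requests and shifting steps are ours; the paper leaves the function implicit):

$$\text{Reward} = RE_a \cdot N_{acc} - RE_p \cdot N_{shift}.$$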
Figure 10 shows the outcomes of the reward function with $RE_a = 5$ and $RE_p = 1$ for the
different initial allocation schemes under various actual request probabilities. From the
figure, it can be seen that our proposed method obtains 100% and 10% more reward units
than the LLF method and the DASD method with load shifting, respectively.
FIG. 10. The reward obtained by different initial allocation schemes over different distribution of actual
demands under the expected parameter (θ = 0.1).
In the next simulation, we change the forecasted demand on videos to Zipf's distribution
with θ = 0.9 and repeat the previous simulation with all other parameters unchanged. The
simulation results are shown in Figs. 11, 12, and 13, which parallel Figs. 8, 9, and 10.
Figures 11, 12, and 13 show that our proposed approach again performs better than the
others, just as in Figs. 8, 9, and 10. It can also be observed that the initial allocation
of all approaches, when based on a near-uniform distribution such as Zipf's distribution
with θ = 0.9 in this simulation, has lower connectivity, which results in lower shifting
capability and a higher fail rate when the actual demands for videos differ from the forecast.

FIG. 11. The request fail rate over the actual demand under the expected parameter (θ = 0.9).

FIG. 12. The number of shifting steps per 100 requests over the actual demand under the expected parameter (θ = 0.9).

FIG. 13. The reward obtained by different initial allocation schemes over different distribution of actual demands under the expected parameter (θ = 0.9).
In the next simulation, we explore the relation between the number of
replicas and system performance. We adopt the same simulation parameters and increase
the percentage of replication from 10% to 100%, i.e. 108 GB per server. We examine the
four situations: (1) the expected demand θ = 0.1 and the actual request θ = 0.1, denoted
as (EP = 0.1/P = 0.1); (2) the expected demand θ = 0.1 and the actual request θ = 0.9,
denoted as (EP = 0.1/P = 0.9); (3) the expected demand θ = 0.9 and the actual request
θ = 0.1, denoted as (EP = 0.9/P = 0.1); (4) the expected demand θ = 0.9 and the actual
request θ = 0.9, denoted as (EP = 0.9/P = 0.9). Figure 14 illustrates the request fail rate
over the percentage of replication for the DASD dancing approach and our approach. We
find that the request fail rate decreases as the percentage of replication increases, and that
our proposed method obtains a lower request fail rate than the DASD method, from around
5% lower under 10% replication to 10% lower under 100% replication. The reason is that,
as the percentage of replication increases, our approach can better allocate the replicas to
the servers. Note that a higher percentage of replication means more edges in the connection
graph, where a more sophisticated algorithm gains an advantage. Figure 14 also shows that
EP = 0.1/P = 0.1 and EP = 0.1/P = 0.9 result in the best and poorest performances,
respectively, among the four situations. Figure 15 shows the number of shifting steps per
100 requests for the simulation in Fig. 14. The number of shifting steps increases with the
percentage of replication because of the decrease in fail rate.

FIG. 14. The request fail rate over different percentage of replication.
In the last simulation, we examine the relation between the request fail rate and the
service capacity of a storage server, i.e., the number of streams supported by a server.
Figure 16 shows the results. We see that the request fail rate decreases as the number of
streams supported by a server increases. For the two situations EP = 0.1/P = 0.1 and
EP = 0.9/P = 0.1, the request fail rates drop dramatically, since the skewed access pattern
is alleviated by increasing the number of users in the system. Moreover, DASD and the
proposed method become closer as the number of served users in the system grows, since
the requests on the servers then become naturally balanced.
FIG. 15. The number of shifting steps per 100 requests under different percentage of replication.

FIG. 16. The request fail rate under different service capacity of a storage server.

5.2. Implementation Issues
To realize the proposed load-shifting scheme, we adopt and extend the broadcast dispatching scheme of OneIP presented by Damani et al. [17]. The basic concept behind the
broadcast dispatching scheme of OneIP is that a cluster of servers publishes a cluster IP
address to users. All requests to the cluster use the cluster IP as their destination, and only
one server within the cluster will respond to each request. Figure 17 demonstrates the
request/response packet flows when OneIP is applied to a video server cluster.

FIG. 17. Apply OneIP technique to a video server cluster.

First, we
assign a unique IP address and an ID to each server, and a shared cluster IP to all servers
within a cluster. The request packets use the cluster IP as their destination IP address. The
destination medium access control (MAC) address of each request packet is replaced
by the broadcast MAC address, so all servers in the cluster receive the
request packets. The servers then run an election algorithm to determine which server handles the
request. For example, a fair election algorithm can be a MOD operation on the source IP
address of the request packet, where the result of the MOD operation determines the ID of the
server that will handle the request. The selected server then responds to the request
using the cluster IP address. If the source IP addresses of the incoming request packets are
uniformly distributed, the load of the servers will be balanced. The details of OneIP can
be found in [17]. OneIP suggests using a simple server election algorithm, such as the MOD
operation, in order to reduce the computation complexity and the memory space of the dispatching
table.
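A minimal sketch of such an election follows; treating the 32-bit source address as an integer is our reading of the MOD operation, and the surrounding packet handling is omitted:

```python
import ipaddress

def elect_server(src_ip, num_servers):
    """Stateless MOD election: every server evaluates this on the request's
    source IP, and only the server whose ID matches handles the packet."""
    return int(ipaddress.ip_address(src_ip)) % num_servers

MY_ID = 2
if elect_server("192.168.1.77", 4) == MY_ID:
    pass  # this server responds, using the cluster IP as the source address
```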
We extend OneIP to support video server clusters and the load shifting operation by
maintaining a file location table and the load information of each server within the cluster.
When a request arrives, the servers determine which server handles it by looking up the file
location table and the load information. Suppose the server with ID = k is assigned to respond
to the request; the server then replies to the client with the handling server ID k. From
then on, the subsequent request packets for the same video service from the client always
carry the handling server ID. The handling server ID can be carried in the control protocol,
e.g., in Real-Time Streaming Protocol (RTSP) or DSM-CC packets. Hence, the servers can
easily determine whether an incoming packet is assigned to them by checking the handling
server ID in
the packet. Before migrating progressing requests between servers, the system will notify
clients to update their handling server IDs in the request packets.
When a new request is blocked, the servers run the load-shifting algorithm to identify a
shifting path. If a feasible shifting path is found, the first server in the shifting path
initiates the shifting operation by sending a hand-off packet to all the servers in the shifting
path. The hand-off packet contains the local time, the expected hand-off time point, all
videos to be shifted, and the servers in the shifting path. Once a server receives the hand-off
packet, it calculates the expected hand-off point of the video file that will be shifted from it
and sends the information to its neighboring server in the path. After all the servers in the
path have received the hand-off packets, the new handling server ID is sent to the clients
viewing the to-be-shifted videos. The next step is to perform the hand-off operation when
the expected hand-off time is reached. Since the downstream is connectionless and the
cluster IP address is always the same, the hand-off procedure is transparent to clients. Clock
distortion and timer drift on the servers may make smooth migration difficult; a small buffer
on the client side is necessary to accommodate the timing difference between the stop of
video transmission from the previous server and the start of transmission from the new
server. According to our experiments, if the hand-off time is set to 1 s and the service is a
1.5-Mbps MPEG-1 video, around 100 KB of buffer is sufficient.
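For illustration, the hand-off packet described above can be modeled as follows (the field names and types are ours, as the paper does not specify a wire format):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class HandoffPacket:
    local_time: float          # sender's local clock, for drift compensation
    handoff_time: float        # the expected hand-off time point
    shifted_videos: List[int]  # all videos to be shifted along the path
    path_servers: List[int]    # IDs of the servers on the shifting path

# Each server on the path computes the expected hand-off point of the video
# it gives up and forwards that information to its successor in the path.
```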
We extend OneIP as an intermediate driver on Windows NT to support the operations
of a distributed file server. Figure 18 shows the Windows NT network architecture and the
packet flows of the extended OneIP driver.

FIG. 18. Packet flows of the extended OneIP driver on Windows NT.

The intermediate driver works between the network layer and the MAC layer. When it
receives request packets, the driver discards those with the wrong handling server ID and
forwards the packets destined for the server to the upper layer. The driver also performs the
load-shifting procedure, in order to reduce operating system overhead and latency, and it
sends and processes the hand-off packets. According to the experimental results, the driver
introduces less than 3% overhead
in sending and receiving load-shifting related packets, and requires around 1 s of hand-off
time, which varies with the interconnection network of the servers.
6. CONCLUSIONS
In this paper, a novel initial allocation of videos for a distributed video storage server was
proposed. Together with the proposed load-shifting algorithm, the system reduces both the
request fail rate and the number of shifting steps. According to the simulation results, the
proposed scheme reduces the request fail rate by up to 50% relative to the same initial
allocation without load shifting, by around 25% to 60% relative to the LLF initial allocation
scheme with load shifting, and by 5% to 10%, with 5% to 25% fewer shifting steps, relative
to the DASD dancing method. Moreover, a kernel-level driver on Windows NT was
prototyped to examine the practicability of the proposed load-shifting scheme.
REFERENCES
1. C. Freedman and D. DeWitt, The SPIFFI scalable video-on-demand server, in ACM SIGMOD, 1995.
2. D. Pegler, N. Yeadon, D. Hutchison, and D. Shepherd, Incorporating scalability into networked multimedia
storage systems, in SPIE Conference on Multimedia Computing and Networking, 1997, pp. 118–134.
3. P. Lougher, R. Lougher, D. Shepherd, and D. Pegler, A scalable hierarchical video storage architecture, in
SPIE Conference on Multimedia Computing and Networking, 1996, pp. 18–29.
4. W. Tetzlaff, M. Kienzle, and D. Sitaram, A methodology for evaluating storage systems in distributed and
hierarchical video servers, in Spring COMPCON '94, 1994.
5. C. Federighi and L. A. Rowe, A distributed hierarchical storage manager for a video-on-demand system, in
Proceedings of the Symposium on Electronic Imaging Science and Technology, Feb. 1994.
6. D. N. Serpanos and T. Bouloutas, Centralized vs Distributed Multimedia Servers, Technical Report RC 20411,
IBM Research Division, T. J. Watson Research Center, March 1996.
7. D. Pegler, D. Hutchison, and D. Shepherd, Scalability issues for a networked multimedia storage architecture,
in Proceedings of 1995 IEE Data Transmission, 1995.
8. W. J. Bolosky, J. S. Barrera, R. P. Draves, R. P. Fitzgerald, G. A. Gibson, M. B. Jones, S. P. Levi, N. P.
Myhrvold, and R. F. Rashid, The Tiger video fileserver, in Proceedings of the Sixth International Workshop
on Network and Operating System Support for Digital Audio and Video, 1996.
9. Y. N. Doganata and A. N. Tantawi, Making a cost-effective video server, IEEE Multimedia 1, 1994, 22–30.
10. Y. Doganata and A. Tantawi, A cost/performance study of video servers with hierarchical storage, in 1st
International Conference on Multimedia Computing and Systems, 1994.
11. T. D. C. Little and D. Venkatesh, Probability-based assignment of videos to storage devices in a video-on-demand system, ACM/Springer Multimedia Systems 2, 1994, 280–287.
12. R. Tewari, D. Dias, R. Mukherjee, and H. Vin, High Availability in Clustered Video Server, Technical Report
RC 20108, IBM Research Division, T. J. Watson Research Center, June 1995.
13. R. Tewari, R. Mukherjee, D. Dias, and H. Vin, Design and performance tradeoffs in clustered video servers,
in 3rd IEEE International Conference on Multimedia Computing and Systems, 1996, 144–150.
14. A. Dan, M. Kienzle, and D. Sitaram, A dynamic policy of segment replication for load balancing in video-on-demand servers, ACM/Springer Multimedia Systems 3, 1995, 93–103.
15. D. N. Serpanos, L. Georgiadis, and T. Bouloutas, MMPacking: A Load and Storage Balancing Algorithm for
Distributed Multimedia Servers, Technical Report RC 20410, IBM Research Division, T. J. Watson Research
Center, March 1996.
16. J. L. Wolf, P. S. Yu, and H. Shachnai, Disk load balancing for video-on-demand systems, ACM/Springer
Multimedia Systems 5, 1997, 358–370.
17. O. P. Damani, P. Y. Chung, Y. Huang, C. M. Kintala, and Y. M. Wang, ONE-IP: Techniques for hosting a
service on a cluster of machines, in Sixth International World Wide Web Conference, 1997.
18. J. Y. Lee, Parallel video servers: A tutorial, IEEE Multimedia 5, Apr.–June 1998, 20–28.
19. M. Kienzle, A. Dan, D. Sitaram, and W. Tetzlaff, The effect of video server topology on contingency capacity
requirements, in Proceedings of Multimedia Computing and Networking, 1996.
20. T. Ibaraki and N. Katoh, Resource Allocation Problems—Algorithmic Approaches, MIT Press, Cambridge,
MA, 1988.
21. A. Federgruen and H. Groenevelt, The greedy procedure for resource allocation problems: Necessary and
sufficient conditions for optimality, Oper. Res. 34, 1986, 909–918.
22. A. Dan and D. Sitaram, An online video placement policy based on bandwidth to space ratio (BSR), in
Proceedings of SIGMOD '95, 1995.
23. G. Zipf, Human Behavior and the Principle of Least Effort, Addison-Wesley, Reading, MA, 1949.
SHIAO-LI TSAO received the B.S., M.S., and Ph.D. degrees in engineering science from National Cheng
Kung University, Tainan, Taiwan, in 1995, 1996, and 1999, respectively. His research interests include multimedia
systems, computer networks, operating systems, and mobile computing. He has published about 30 international
journal/conference papers in the area of multimedia storage servers. He holds or has filed four U.S. patents in total.
MENG CHANG CHEN received the B.S. and M.S. degrees in computer science from National Chiao Tung
University, Taiwan in 1979 and 1981, respectively, and the Ph.D. degree in computer science from UCLA in 1989.
He joined AT&T Bell Labs in 1989 as a member of the technical staff. He became an associate professor in the
Department of Information Management, National Sun Yat-Sen University, Taiwan in 1992. Since 1993, he has
been with the Institute of Information Science, Academia Sinica, Taiwan. His current research interests include
multimedia systems and networking, QoS networking, operating systems, database and knowledge base systems,
and Internet document management and access.
MING-TAT KO received the B.S. and M.S. degrees in mathematics from National Taiwan University in 1979
and 1982, respectively. He received a Ph.D. in computer science from National Tsing Hua University, Taiwan in
1988. He then joined the Institute of Information Science as an associate research fellow. Dr. Ko's major research
interests include the design and analysis of algorithms, computational geometry, graph algorithms, multimedia
systems, and computer graphics.
JAN-MING HO received his Ph.D. degree in electrical engineering and computer science from Northwestern
University in 1989. He received his B.S. in electrical engineering from National Cheng Kung University in 1978
and his M.S. from the Institute of Electronics of National Chiao Tung University in 1980. He joined the Institute
of Information Science, Academia Sinica, Taiwan, as an associate research fellow in 1989 and was promoted to
research fellow in 1994. He visited the IBM T. J. Watson Research Center in the summers of 1987 and 1988; the
Leonardo Fibonacci Institute for the Foundations of Computer Science, Italy, in summer 1992; and the
Dagstuhl-Seminar on Combinatorial Methods for Integrated Circuit Design, IBFI-Geschäftsstelle, Schloss Dagstuhl,
Fachbereich Informatik, Bau 36, Universität des Saarlandes, Germany, in October 1993. His research interests
include real-time operating systems with applications to continuous media systems (e.g., video on demand and
video conferencing), computational geometry, combinatorial optimization, VLSI design algorithms, and the
implementation and testing of VLSI algorithms on real designs.
YUEH-MIN HUANG received the B.S. degree in engineering science from National Cheng-Kung University,
Taiwan, in 1982, and the M.S. and Ph.D. degrees in electrical engineering from the University of Arizona
in 1988 and 1991, respectively. He has been with National Cheng-Kung University since 1991 and is currently
an associate professor in the Department of Engineering Science. His research interests include video-on-demand
systems, real-time operating systems, multimedia, and data mining.